Armed Polite Society

Main Forums => The Roundtable => Topic started by: Ben on August 07, 2020, 10:08:06 AM

Title: Excel Forces Scientists to Rename Genes
Post by: Ben on August 07, 2020, 10:08:06 AM
Interesting article, and you have to wonder how many things have slipped by in this and other fields that rely on the database.

https://www.theverge.com/2020/8/6/21355674/human-genes-rename-microsoft-excel-misreading-dates
Title: Re: Excel Forces Scientists to Rename Genes
Post by: WLJ on August 07, 2020, 10:46:32 AM
No one thought of merely formatting the cells so as to not do this?  Takes all of 3 seconds to do the entire spreadsheet, done. You can even do it after the fact :facepalm:
Title: Re: Excel Forces Scientists to Rename Genes
Post by: Pb on August 07, 2020, 10:50:22 AM
No one thought of merely formatting the cells so as to not do this?  Takes all of 3 seconds to do the entire spreadsheet, done :facepalm:

I would like to know how to do that.  That has been very annoying.
Title: Re: Excel Forces Scientists to Rename Genes
Post by: WLJ on August 07, 2020, 10:52:53 AM
From the Home Tab:  Format --> Format Cells --> Text

You can do just a single cell or a selection of cells or an entire column or roll or the entire spreadsheet at a time.
Title: Re: Excel Forces Scientists to Rename Genes
Post by: zahc on August 07, 2020, 10:58:56 AM
No one thought of merely formatting the cells so as to not do this?  Takes all of 3 seconds to do the entire spreadsheet, done. You can even do it after the fact :facepalm:

If only it were that easy. That's the way it SHOULD work, and sometimes it's the way it does work. But there are also bugs in Excel that have persisted for decades, where data is converted upon importing it. The data is mangled and cannot be recovered. Nobody wants this behavior and there is no reason for it, but it's a very common problem.

In a sane spreadsheet, the source data and the human representation of the data should be two separate things. It should always be possible just to change the formatting if you don't like it. The spreadsheet should never over-write the source data unless the user explicitly changes it through the formula bar. Did you notice how many times I said "should"? In reality, Excel irreversibly mangles certain data in a way that you cannot recover it. This happens mostly when importing data and people have been complaining about it for literally decades. What the researchers are doing is probably the right thing. I take that back; the right thing is to not use Excel. So what they are doing is the pragmatic thing. The entire scientific community changing it's practices is more likely than Microsoft fixing bugs in Excel.


Quote
You can do just a single cell or a selection of cells or an entire column or roll or the entire spreadsheet at a time.

If only this were true. But I promise, interact with enough different types of data, and you will run into data-mangling, and you will find out there's nothing you can do about it, as many others have. As a fun exercise, I suggest importing some data with that has a minority of fields with leading zeros, and/or whitespace. After that, I suggest importing data that was exported using a different version of Excel, or the same version of Excel in a different region. Or, try importing some genetic data...
Title: Re: Excel Forces Scientists to Rename Genes
Post by: WLJ on August 07, 2020, 11:04:30 AM
Can't remember off the top of my head where but there's, or was, an import special that would take care of that.
Title: Re: Excel Forces Scientists to Rename Genes
Post by: Brad Johnson on August 07, 2020, 11:34:19 AM
Paste Special > Paste as Text

Brad
Title: Re: Excel Forces Scientists to Rename Genes
Post by: WLJ on August 07, 2020, 11:41:16 AM
Paste Special > Paste as Text

Brad

Yeah, paste special, don't know why I typed import special. Guess I had import on my mind at the moment.
Would use it all the time moving data between spreadsheets.
Title: Re: Excel Forces Scientists to Rename Genes
Post by: zahc on August 07, 2020, 12:14:01 PM
Can't remember off the top of my head where but there's, or was, an import special that would take care of that.


It has bugs. That's the whole problem. I don't know why this is hard to believe.
Title: Re: Excel Forces Scientists to Rename Genes
Post by: cordex on August 07, 2020, 12:22:19 PM
Yeah, zahc is completely right.  If you use Excel a lot you run into bugs all the time.  Sometimes there are workarounds, sometimes no.
Title: Re: Excel Forces Scientists to Rename Genes
Post by: WLJ on August 07, 2020, 12:52:02 PM
It's not a bug, it's an undocumented feature
Title: Re: Excel Forces Scientists to Rename Genes
Post by: cordex on August 07, 2020, 02:17:54 PM
It's not a bug, it's an undocumented feature
Some of Excel's bugs are documented features (https://docs.microsoft.com/en-us/office/troubleshoot/excel/wrongly-assumes-1900-is-leap-year).
Title: Re: Excel Forces Scientists to Rename Genes
Post by: WLJ on August 07, 2020, 02:39:04 PM
I've been at this long enough to miss VisiCalc and those upstarts Lotus 123 v1 and Quattro v1.0
Title: Re: Excel Forces Scientists to Rename Genes
Post by: RocketMan on August 08, 2020, 01:07:06 PM
I've been at this long enough to miss VisiCalc and those upstarts Lotus 123 v1 and Quattro v1.0

Those are some programs I had almost forgotten.  Thanks for the reminder of how old some of us are getting.
Title: Re: Excel Forces Scientists to Rename Genes
Post by: WLJ on August 08, 2020, 01:19:28 PM
Quote
In January 1983, Lotus introduced Lotus 1-2-3 at a price of $495.

Run through an inflation calc than equals $1,281 in 2020 dollars. YIKES!

How many remember this little detail?

Quote
When Excel first appeared in 1985, it was offered only for the Macintosh. Jerry Pournelle, a well-known columnist for Byte (and science-fiction author), wrote (incorrectly but nonetheless prophetically): ‘Excel will make the Mac into a serious business machine.’

Yeah, that went real well

https://www.poynter.org/reporting-editing/2015/today-in-media-history-lotus-1-2-3-was-the-killer-app-of-1983/