TECHNICAL TIPS
for Genealogy

 


Gedcom & Genealogy Programs

What is Gedcom?
Gedcom is a protocol or set of rules for the exchange of genealogical data. It was originated by the LDS Church who are into record gathering in a big way. It defines a record structure and fields used to store and move data and it does so using what can be thought of as the lowest common denominator - a simple text file. LDS developed the protocol for submitting data for the LDS Temple and Ancestral File but it has become a de-facto standard for exporting and importing data between genealogical computer programs. In an introduction to the Gedcom Standard documentation they say:

GEDCOM was developed by the Family History Department of The Church of Jesus Christ of Latter-day Saints (LDS Church) to provide a flexible, uniform format for exchanging computerized genealogical data. GEDCOM is an acronym for GEnealogical Data COMmunication. Its purpose is to foster the sharing of genealogical information and the development of a wide range of inter-operable software products to assist genealogists, historians, and other researchers.

Most genealogical data describes people in terms of relationships and events e.g. John Smith was born on 18 June 1850, his parents were William Smith and Anne Hindle. In fact a Genealogy program doesn't "think" like that - it will record three people and a family. It stores the information about the people and then connects them to the family, so John would be connected to the family as a child, William would be connected as a husband, Anne as a wife. There will be a host of facts associated with the individuals and they need their pre-defined classifications. It is these parcels of information that Gedcom attempts to replicate so that the data can be turned into a text file and exported in a form that can be imported to another system.

To do this it defines a number of "tags" for the data. As an example the following fragment from a Gedcom file describes John Smith:

0 @I1@ INDI
1 NAME John /SMITH/
    2 SURN SMITH
    2 GIVN John
1 SEX M
1 BIRT
    2 DATE 18 Jun 1850
    2 PLAC Rishton
1 CHR
    2 DATE 21 Jun 1850
    2 PLAC Great Harwood
1 DEAT
    2 DATE 12 Oct 1910
    2 PLAC Blackburn

I have indented some parts to make it clearer which date applies to which event. I think this is reasonably self-explanatory: it gives John's name, gender and his dates and places of Birth, Christening and Death.
This data can then be transported to another system and imported.
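
As a rough illustration of how simple the line format is, here is a minimal Python sketch that splits each line of the fragment above into its level number, optional cross-reference (the @I1@ part), tag and value. The names GedcomLine and parse_line are my own inventions for this example and do not come from any genealogy program; a real importer has to cope with a great deal more than this.

```python
from typing import NamedTuple, Optional


class GedcomLine(NamedTuple):
    level: int
    xref: Optional[str]  # e.g. "@I1@" on the first line of a record
    tag: str
    value: str


def parse_line(raw: str) -> GedcomLine:
    """Split one Gedcom line into level, optional cross-reference, tag and value."""
    parts = raw.strip().split(" ", 2)
    level = int(parts[0])
    if len(parts) > 1 and parts[1].startswith("@"):
        # Record header such as "0 @I1@ INDI": the identifier comes before the tag.
        xref = parts[1]
        rest = parts[2].split(" ", 1) if len(parts) > 2 else [""]
        tag = rest[0]
        value = rest[1] if len(rest) > 1 else ""
    else:
        xref = None
        tag = parts[1] if len(parts) > 1 else ""
        value = parts[2] if len(parts) > 2 else ""
    return GedcomLine(level, xref, tag, value)


FRAGMENT = """\
0 @I1@ INDI
1 NAME John /SMITH/
    2 SURN SMITH
    2 GIVN John
1 SEX M
1 BIRT
    2 DATE 18 Jun 1850
    2 PLAC Rishton
"""

parsed = [parse_line(line) for line in FRAGMENT.splitlines() if line.strip()]
for item in parsed:
    print(item)
```

The indentation I added for readability is simply stripped off; it is the level number at the start of each line that says which event a DATE or PLAC belongs to.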


So what's the problem?

The problem is that Genealogy programs aren't written around Gedcom - that bit comes later! Different programs will major on different aspects and might not have made any provision for some events. They might also have used the Gedcom specification in a different (but legal) way; there is often more than one way to represent the same information. Taking John Smith, the record above was created using PAF5 (Personal Ancestral File Ver 5 - a program from LDS). I then imported it into Genopro, the program I normally use, exported it again, and this is what I got:

0 @IND00001@ INDI
1 NAME John /Smith/
1 SEX M
1 BIRT
    2 DATE 18 JUN 1850
1 DEAT
    2 DATE 12 OCT 1910

I have lost all information about Christening and the places of Birth & Death. As a user of Genopro this shouldn't surprise me as there is nowhere in that program for me to enter Christening data or place data for Birth and Death. His name is still there but isn't split into SURName and GIVeN name although in Genopro I do enter Surname and Given name separately; it "knows" that the bit between the // is the Surname.

This explains the problem at its simplest level - if you are exchanging data between two different programs you might lose information. In some cases an importing program might create an error log or output warnings, but how many people are going to work their way through all of them, some pretty obscure, to see what they are missing?
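
If you want a quick check on what a round trip has dropped, one approach is to compare the tags present in the two files. The Python sketch below does this for the two John Smith fragments shown earlier. The function name tag_paths and the idea of writing a tag's position as a path such as INDI/BIRT/PLAC are my own assumptions for the example, not part of the Gedcom standard, and the comparison only tells you which kinds of information have gone, not which values.

```python
def tag_paths(gedcom_text: str) -> set:
    """Return every tag in the text as a path of enclosing tags, e.g. 'INDI/BIRT/DATE'."""
    paths = set()
    stack = []  # tags of the enclosing lines, one per level
    for raw in gedcom_text.splitlines():
        parts = raw.split()
        if len(parts) < 2:
            continue
        level = int(parts[0])
        # On a record header ("0 @I1@ INDI") the tag follows the @...@ identifier.
        tag = parts[2] if parts[1].startswith("@") and len(parts) > 2 else parts[1]
        del stack[level:]  # drop anything at this level or deeper
        stack.append(tag)
        paths.add("/".join(stack))
    return paths


PAF_EXPORT = """\
0 @I1@ INDI
1 NAME John /SMITH/
2 SURN SMITH
2 GIVN John
1 SEX M
1 BIRT
2 DATE 18 Jun 1850
2 PLAC Rishton
1 CHR
2 DATE 21 Jun 1850
2 PLAC Great Harwood
1 DEAT
2 DATE 12 Oct 1910
2 PLAC Blackburn
"""

GENOPRO_EXPORT = """\
0 @IND00001@ INDI
1 NAME John /Smith/
1 SEX M
1 BIRT
2 DATE 18 JUN 1850
1 DEAT
2 DATE 12 OCT 1910
"""

lost = sorted(tag_paths(PAF_EXPORT) - tag_paths(GENOPRO_EXPORT))
for path in lost:
    print("missing after round trip:", path)
```

For these two fragments the list covers the Christening, the three places and the SURN and GIVN lines, which matches what I found by eye.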

As an example I recently received a Gedcom and when I imported it to FTM2006 (Family Tree Maker) I got 983 lines like this:

WARNING:  line 3067: RIN 227: Name must have 0 or 2 slashes: 'Radulph/Raphe /SMITH/'.
ERROR 2:  line 7876: RIN 576: Unexpected tag 'TEXT' in Citation Structure.
                3  TEXT together with John and Mary Ann
WARNING:  line 10230: RIN 762: Name must have 0 or 2 slashes: 'Henry/Harry /SMITH/'.

The warnings are, in fact, just that - I haven't lost any data, it just isn't in an approved form. The / character within a name field is used to separate the Surname. The originator of this data was unsure about the given names so entered the alternatives separated by a / . Result - confusion for the computer: it doesn't know if in the first case the Surname is Raphe or SMITH (they are both enclosed by /s ). The error, on the other hand, means I have lost some information. By going to line 7876 of the Gedcom file I might be able to work out what was meant and then enter it by hand in my program, after locating the individual involved.

An extract from the standard on Personal Names reads :

The surname of an individual, if known, is enclosed between two slash (/) characters. The order of the name parts should be the order that the person would, by custom of their culture, have used when giving it to a recorder. If part of name is illegible, that part is indicated by an ellipsis (...). Capitalize the name of a person or place in the conventional manner - capitalize the first letter of each part and lowercase the other letters, unless conventional usage is otherwise. For example: McMurray.
Examples:
William Lee (given name only or surname not known)
/Parry/ (surname only)
William Lee /Parry/
William Lee /Mac Parry/ (both parts (Mac and Parry) are surname parts)
William /Lee/ Parry (surname imbedded in the name string)
William Lee /Pa.../
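
The "0 or 2 slashes" rule behind the warnings can be sketched in a few lines of Python. This is only my guess at the sort of check an importing program makes; the function name split_personal_name is made up for the example and the wording of the error simply copies the FTM warning.

```python
def split_personal_name(name: str):
    """Return (text before surname, surname or None, text after surname).

    Raises ValueError unless the name contains exactly 0 or 2 slashes.
    """
    slashes = name.count("/")
    if slashes == 0:
        # Given name only, or surname not known.
        return name.strip(), None, ""
    if slashes != 2:
        raise ValueError(f"Name must have 0 or 2 slashes: {name!r}")
    before, surname, after = name.split("/")
    return before.strip(), surname.strip(), after.strip()


for example in ["William Lee", "/Parry/", "William Lee /Mac Parry/",
                "William /Lee/ Parry", "Radulph/Raphe /SMITH/"]:
    try:
        print(example, "->", split_personal_name(example))
    except ValueError as problem:
        print("WARNING:", problem)
```

The examples from the standard all pass, while 'Radulph/Raphe /SMITH/' has three slashes and is rejected, which is exactly the confusion described above.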


Dates

Dates represent a special challenge. Gedcom is only interested in recording them, in fact it makes extensive provision for stuff like the Hebrew, French Revolutionary, Roman, Julian and Gregorian Calendars (but not the Islamic or Chinese) and allows for special forms for approximate dates. So you could have a person whose birth was recorded in the Hebrew form but his death was recorded by means of the French revolutionary calendar. I don't, however, think there is a program that would work out how old he was when he died (actually I doubt if there is a program that would enable you to enter that sort of information, but that doesn't concern people who live in ivory towers and write standards). The point is that some programs will let you enter approximate dates such as ABT 1820, or BETween 1795 AND 1804, others won't. They are the ones that are going to calculate something and can't work with woolly dates. I recently received a Gedcom which included a DATe of Burnley, Lancs. How that happened I have no idea but it looks like the originating program had a very flexible approach to what constitutes a DATe.

So what can I do about it?

Being realistic - not a lot. You have paid good money for the program you are using and have entered a lot of data with it. To switch now you will have to pay more money and even then you have to export your data and import it into the new system. You should, however, be aware of the limitations of your program. In the case of Genopro, as I explained previously, there is nowhere to put Christenings or places of Birth or Death, so what do I do? I use the ubiquitous NOTEs field and finish up with something like this:

0 @IND00001@ INDI
1 NAME John /Smith/
1 SEX M
1 BIRT
    2 DATE 18 JUN 1850
1 DEAT
    2 DATE 12 OCT 1910
    2 NOTE Died at Blackburn
1 NOTE Born in Rishton
    2 CONT Christened - Gt Harwood, 21 June 1850

There is a place for NOTEs associated with DEATh, but not with BIRTh so the Place of Birth gets stuck in a general NOTE associated with the INDIvidual and this note is CONTinued on a second line to give the Christening data. I'm pretty sure that all the info will show up somewhere in most other systems, but probably not where the person using that program expects to see it.
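
To show how a receiving program might put such notes back together, here is a small Python sketch that gathers each NOTE and joins any CONT lines onto it. It assumes, for simplicity, that a CONT line always belongs to the NOTE immediately before it, which is true of my record above but not of Gedcom files in general; the function name collect_notes is my own.

```python
def collect_notes(lines):
    """Return the text of each NOTE, with CONT lines appended as new lines."""
    notes = []
    for raw in lines:
        parts = raw.strip().split(" ", 2)
        if len(parts) < 2:
            continue
        tag = parts[1]
        text = parts[2] if len(parts) > 2 else ""
        if tag == "NOTE":
            notes.append(text)
        elif tag == "CONT" and notes:
            # Assumption: this CONT continues the most recent NOTE.
            notes[-1] += "\n" + text
    return notes


RECORD = """\
0 @IND00001@ INDI
1 NAME John /Smith/
1 SEX M
1 BIRT
    2 DATE 18 JUN 1850
1 DEAT
    2 DATE 12 OCT 1910
    2 NOTE Died at Blackburn
1 NOTE Born in Rishton
    2 CONT Christened - Gt Harwood, 21 June 1850
"""

notes = collect_notes(RECORD.splitlines())
for note in notes:
    print(repr(note))
```

The information survives, but only as free text: the receiving program has no way of knowing that "Born in Rishton" is really a place of birth.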

Remember that you will suffer (but may not see) the problems associated with imported data, but others will be on the receiving end of data you export and may think your data is corrupt when it's just program incompatibilities. Try to be aware of incompatibilities if you frequently exchange data with someone and, if all else fails, be prepared to re-enter some data by hand.

After all I have said about Genopro you will probably wonder why I didn't dump it long ago. Well, that's the point - Gedcoms are only a small part of what I do with it. I like the visual interface and its ability to generate my websites; for me there are more swings than roundabouts. Even upgrading is fraught with problems - I had a look and decided that moving all the notes that really belong in the proper fields of the new version is too big a task to handle, and that's without using Gedcom to shift the data! I have used Genopro as my example because it is the one I use and I'm familiar with it. Other programs have similar, but different, problems. The important thing is to be aware of the issues and that there isn't a magic wand that will solve them.



"Fuzzy" dates

These are standard Gedcom abbreviations used for "Fuzzy" or imprecise dates:

Approximate Dates :  

ABT <DATE>  ABT =About, meaning the date is not exact.

CAL <DATE>  CAL =Calculated mathematically, for example, from an event date and age.

EST <DATE>  EST =Estimated based on an algorithm using some other event date.

 Date Ranges :  

AFT <DATE>  AFT =Event happened after the given date.

BEF <DATE>  BEF =Event happened before the given date.

BET <DATE> AND <DATE>  BET =Event happened some time between date 1 and date 2. For example, BET 1904 AND 1915 indicates that the event state (perhaps a single day) existed somewhere between 1904 and 1915 inclusive.

FROM <DATE> TO <DATE> FROM=Event happened continuously between date 1 and date 2. For example, FROM 1904 TO 1915 indicates that the event state started in 1904 and ended in 1915.
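
For anyone who wants to sort through the DATE values in a Gedcom before importing it, the abbreviations above can be recognised with a short Python sketch like the one below. It is only an illustration: classify_date is a name I have made up, the pattern for a plain date covers only the everyday "18 JUN 1850", "JUN 1850" and "1850" forms, and none of the other calendars is handled.

```python
import re

MONTHS = "JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC"
# Plain date: optional day, optional month, then a year (a simplifying assumption).
PLAIN_DATE = re.compile(rf"(?:(?:\d{{1,2}} )?(?:{MONTHS}) )?\d{{1,4}}")

QUALIFIERS = {"ABT": "about", "CAL": "calculated", "EST": "estimated",
              "BEF": "before", "AFT": "after"}


def is_plain_date(text: str) -> bool:
    return PLAIN_DATE.fullmatch(text) is not None


def classify_date(value: str):
    """Return a tuple naming the kind of date followed by the date(s) it contains."""
    text = " ".join(value.upper().split())
    for kind, pattern in (("between", r"BET (.+) AND (.+)"),
                          ("from-to", r"FROM (.+) TO (.+)")):
        match = re.fullmatch(pattern, text)
        if match and all(is_plain_date(d) for d in match.groups()):
            return (kind,) + match.groups()
    prefix, _, rest = text.partition(" ")
    if prefix in QUALIFIERS and is_plain_date(rest):
        return (QUALIFIERS[prefix], rest)
    if is_plain_date(text):
        return ("exact", text)
    return ("unrecognised", value)


for sample in ["18 Jun 1850", "ABT 1820", "BET 1795 AND 1804",
               "FROM 1904 TO 1915", "Burnley, Lancs"]:
    print(sample, "->", classify_date(sample))
```

Anything that comes back as "unrecognised", such as my DATe of Burnley, Lancs, is worth looking at by hand before the import.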


Robert Calvert
4 March 2007



The 1752 Date Changes - Julian & Gregorian Calendars


Calendar

It is worth noting that during the currency of the Parish Registers the form of recording dates changed - the year used to run from March to March, not January to December so an entry for, say, 21 January 1563 would in modern form refer to 21 January 1564. It is pre-1753 that the confusion could occur.

In 1751 the year started on 25 March and ended on 31 December. In 1752 the year started on 1 January and ended on 31 December. However, it appears that as far as Great Harwood's Parish Records are concerned the year change did not happen until the following year, with the last birth of 1752 being recorded on 8 March 1752 and the first of 1753 being three weeks later on 29 March 1753.

The pre-1752 calendar is known as the Julian, or Old Style and dates are sometimes shown in modern documents as OS to indicate what year is intended. The post-1752 calendar is the Gregorian calendar.

Old Style dates are sometimes shown as 21 Jan 1563/64; however, spreadsheets and software which carry out calculations based on dates cannot handle this format. If you are looking at copies of original records the dates before 1752 will be recorded in Old Style, later dates in New Style format. In some cases where the dates have been transcribed later and where they are likely to be processed by computers to calculate ages etc. a conversion from Old to New Style may have been made; you may have to examine the context in the absence of clarifying notes.
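
The year adjustment itself is simple enough to sketch in Python. The function names and the last_old_style_year setting below are my own assumptions for the example: the default of 1750 follows the national change, and a register that changed late (as Great Harwood's appears to have done) would need a later value. Note that this only corrects the year number; it does nothing about the eleven days dropped in September 1752.

```python
def modern_year(day: int, month: int, recorded_year: int,
                last_old_style_year: int = 1750) -> int:
    """Year in modern (1 January) reckoning for a date recorded in the Old Style."""
    before_lady_day = month < 3 or (month == 3 and day < 25)
    if recorded_year <= last_old_style_year and before_lady_day:
        # 1 January to 24 March still carried the previous year's number.
        return recorded_year + 1
    return recorded_year


def dual_year(day: int, month: int, recorded_year: int,
              last_old_style_year: int = 1750) -> str:
    """Show the year in the '1563/64' form where the two reckonings differ."""
    modern = modern_year(day, month, recorded_year, last_old_style_year)
    if modern == recorded_year:
        return str(recorded_year)
    return f"{recorded_year}/{modern % 100:02d}"


print(modern_year(21, 1, 1563))  # 21 January 1563 as recorded
print(dual_year(21, 1, 1563))
print(dual_year(25, 3, 1563))    # 25 March onwards needs no adjustment
```

Keeping the recorded year and the modern year as two separate whole numbers, rather than the text "1563/64", is what lets a spreadsheet or program go on to calculate ages.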

An Explanation

The following is based on a document I found on the Internet at the time of the Year 2000 scare regarding computer dates - much of the explanation related to Leap Years but does include a good explanation of the Julian and Gregorian calendars.

Julius Caesar (or more correctly, perhaps, Gaius Julius) who lived from 100-44 BC had a fascination with the calendar although he made some awful mathematical errors in his decrees regarding same. He named one of the summer months after himself.
Anyway, after meditation and consultation with his astronomers and others he realized that 365 days in a year was not quite accurate. Using the tools at his disposal he decided there had to be an extra quarter day tossed in each year to make things work out right. So the new calendar was devised according to his instructions. They went along that way for quite awhile, adding one day every four years to account for that left over bit each year. The error was not big enough to notice even over several hundred years.
Well, a few hundred years later the Pope called on one of his scholars and bright young men, a monk by the name of Dennis Aloysius and said "Dennis, figure out a new system of years to go by." Dennis thought about that for quite a while and after some serious calculations told the Pope that Jesus had been born in the 720th year of the Roman Era and that henceforth that would be known as year 1, and that therefore it was now the 520th year in the Christian era (520 AD).
Fast forward a thousand years or so, and Pope Gregory has been told by his advisers that the calendar is going to be adjusted again. As they explained it to Pope Gregory, Gaius Julius had it all wrong: instead of a year consisting of 365.25 days, it really only consisted of 365.2422 days, or 365 days, 5 hours, 48 minutes and 49 seconds. As a result of that 11 minute difference each year unaccounted for in Gaius Julius' calendar, it had gotten way out of whack, being some several days short of where it ought to be. By the time they can enforce this on all of Europe however, several more years have passed. In 1582 a general adjustment was made throughout Europe with the calendar; several days were just chopped out to make up the shortfall.
Just on general principles, England and her colonies in America did not go along with the adjusted calendar. By the middle 1700's though, this calendar dispute was getting awkward and embarrassing for the British. After all, they still did some trading with Continental Europe and it was getting a bit weary for folks here to say it was Tuesday, May 1 while people on the Continent were saying no, it is really Tuesday, May 12. Not only that, there was a movement afoot to change New Year's Day to January 1. Someone had thought, wisely, from long before that the perfect time to start the New Year was when spring started; a new year, new birth and all that ... so it has always been that March 24 was 'New Years Eve' and March 25 was New Years Day. For example, March 24, 1610 was followed by March 25, 1611 ... it was always done that way for however long. So 170 years after Catholic Europe the British gave in and agreed to adjust the calendars with the new year starting on January 1. September, 1752 saw the new Gregorian calendar take effect in America and England, and the calendar for that month looked like this:
September 1752
Su  M Tu  W Th  F Sa
       1  2 14 15 16
17 18 19 20 21 22 23
24 25 26 27 28 29 30
We had to yank 11 days out while Continental Europe only knocked 10 days out, since extra time passed before we finally did it. This, they assured us, plus following the new rules for leap years, would keep everything in good shape for quite a while. The people were a bit unhappy though and riots broke out demanding "Give us back our 11 days".
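
The arithmetic behind all this can be checked with a short Python sketch. It uses the 365.2422-day figure quoted above and the usual leap year rules (every fourth year in the Julian calendar; in the Gregorian, century years only when divisible by 400). The function names are my own, and the gap formula is the standard century rule, which holds from March of a century year onwards.

```python
JULIAN_YEAR_DAYS = 365.25
SOLAR_YEAR_DAYS = 365.2422  # figure quoted in the explanation above


def drift_minutes_per_year() -> float:
    """How far the Julian year overshoots the solar year, in minutes."""
    return (JULIAN_YEAR_DAYS - SOLAR_YEAR_DAYS) * 24 * 60


def is_leap_year(year: int, gregorian: bool) -> bool:
    if gregorian:
        return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)
    return year % 4 == 0


def calendar_gap_days(year: int) -> int:
    """Days between the Julian and Gregorian calendars in the given year.

    Valid from March of a century year onwards; January and February of
    such a year still have the previous century's gap.
    """
    return year // 100 - year // 400 - 2


print(round(drift_minutes_per_year(), 1), "minutes per year")
print(calendar_gap_days(1582), "days in 1582")
print(calendar_gap_days(1752), "days in 1752")
print(is_leap_year(1700, gregorian=False), is_leap_year(1700, gregorian=True))
```

The extra day between the two adjustments comes from 1700, which was a leap year in the Julian calendar still used in England but not in the Gregorian one.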

I no longer have the source of the original document, so cannot provide an attribution for it - If you are the original author please let me know so that I can provide an acknowledgement.


Summary of the 1752 Calendar Change

31 December 1750 was followed by 1 January 1750
24 March 1750 was followed by 25 March 1751
31 December 1751 was followed by 1 January 1752
2 September 1752 was followed by 14 September 1752
31 December 1752 was followed by 1 January 1753


Quakers

Quakers (the Society of Friends) were unhappy using names for days and months which were derived from pagan gods. They used to quote dates by numbering the day and month, e.g. "the fourth day of the third month". If you are researching Quakers be aware of the confusion that can occur over what is the "third month" - it might not be March, it depends on the date of the event recorded (and perhaps when the event was actually recorded).
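
As a small illustration, the Python sketch below turns a Quaker month number into a month name. It assumes the usual convention that before the 1752 change the "first month" was March, and January afterwards; whether a particular record follows the old or the new numbering is something you have to judge from the record itself, so the function (whose name is my own) takes it as a setting rather than trying to guess.

```python
MONTH_NAMES = ["January", "February", "March", "April", "May", "June",
               "July", "August", "September", "October", "November", "December"]


def quaker_month_name(month_number: int, old_style: bool) -> str:
    """Name of the numbered month; old_style=True counts March as the first month."""
    if not 1 <= month_number <= 12:
        raise ValueError("month number must be between 1 and 12")
    offset = 2 if old_style else 0  # assumption: Old Style numbering starts at March
    return MONTH_NAMES[(month_number - 1 + offset) % 12]


print(quaker_month_name(3, old_style=True))   # "third month" before the change
print(quaker_month_name(3, old_style=False))  # "third month" after the change
print(quaker_month_name(11, old_style=True))
```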

Robert Calvert
22 March 2007



 

If you have any hints and tips to offer please contact us.

To contact Arrodgen click here:      Arrodgen Webmaster

 
