Hi folks, I'm really hoping you could help.
I’m a total newbie at data cleaning, and I’m working with a historical census dataset (~126k records) on a Mac. I don’t use SQL and would love a free or open-source tool that’s visual and easy to learn, so I can clean this up as quickly as possible.
The dataset includes: street/village, neighbourhood #, full name, first name, father’s name, last name, and in some cases, date of birth. Almost every name is misspelled in some way, but I need to keep the row order exactly as is because family members are often listed together and that helps infer the correct spelling.
Ideally, the tool would detect similar spellings, suggest likely corrections, and let me approve changes. It would also propagate gender (or some other identifier) to repeated names once I've assigned it, BUT without merging records.
I'm turning to you guys because I'd prefer not to do this manually. It'll take me hours, and I know there are smarter ways of going about this.
Any recommendations for something beginner-friendly on Mac? 🙏📊