Since I could not convince my boss to give me all of the personal information his business holds, I had to come up with a way to generate data that was at least somewhat realistic.
So I found a site, Melissa.com, that has listings of all carrier routes. I used melissa_address_scraper.py to borrow those data columns and transform them into full addresses. From there, I used mimesis, a Python library for generating fake data, to produce the rest of the fields. For age and similar values, I used a simple random generator.
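As a rough illustration of the "simple random generator" step, here is a minimal sketch using only Python's standard `random` module. The function name `random_customer_fields`, the `household_size` field, and the value ranges are assumptions made for this example and are not taken from the project itself.

```python
import random


def random_customer_fields(rng):
    """Return a few illustrative numeric fields for one fake customer.

    The field names and ranges are hypothetical; adjust them to match
    whatever the real dataset needs.
    """
    return {
        "age": rng.randint(18, 90),            # inclusive on both ends
        "household_size": rng.randint(1, 6),   # assumed field, for illustration
    }


# A seeded generator makes the fake data reproducible between runs.
rng = random.Random(42)
customers = [random_customer_fields(rng) for _ in range(5)]
```

Passing in a seeded `random.Random` instance, rather than calling the module-level functions, keeps each run repeatable, which helps when the generated data feeds later experiments.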
From that point, I spent the bulk of my time on the sales side, trying to simulate some sort of natural buying process and a system that takes previous history into account. I know I could have gone much further with this section, but I just wanted something to test my knowledge of machine learning. Being able to push a button and simulate an office for months or a year made all of the long hours worth it.
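To give a feel for what "taking previous history into account" could look like, here is a small sketch in which a customer's chance of buying on a given day rises with the number of purchases they have already made. This is not the project's actual model: the function names, the linear boost, and every constant below are assumptions chosen only for illustration.

```python
import random


def purchase_probability(previous_purchases, base=0.05, boost=0.02, cap=0.5):
    """Chance of a purchase today, growing with past purchases up to a cap.

    base, boost, and cap are made-up illustrative values.
    """
    return min(cap, base + boost * previous_purchases)


def simulate_days(num_customers, num_days, rng):
    """Simulate daily sales; return per-customer history and daily totals."""
    history = [0] * num_customers  # purchases made so far by each customer
    daily_sales = []
    for _ in range(num_days):
        sales_today = 0
        for i in range(num_customers):
            if rng.random() < purchase_probability(history[i]):
                history[i] += 1
                sales_today += 1
        daily_sales.append(sales_today)
    return history, daily_sales


# Simulate one month for a hypothetical office with 100 customers.
history, daily_sales = simulate_days(100, 30, random.Random(0))
```

Raising `num_days` is the "push a button" part: the same loop can cover months or a year, and the resulting history can then serve as training data.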
This being my first real project, I wanted to try my hardest to incorporate some test functions. I really wanted to start with the full TDD cycle, but could not, because I did not yet know what each section was going to do. I realize that a much more structured outline before coding would have been a great help, and I definitely plan to approach my next project with that in mind.