Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

January 10, 2014

…Customizable Test Data with Python

Filed under: Data,Python — Patrick Durusau @ 5:15 pm

A Tool to Generate Customizable Test Data with Python by Alec Noller.

From the post:

Sometimes you need a dataset to run some tests – just a bunch of data, anything – and it can be unexpectedly difficult to find something that works. There are some useful and readily-available options out there; for example, Matthew Dubins has worked with the Enron email dataset and a complete list of 9/11 victims.

However, if you have more specific needs, particularly when it comes to format and fitting within the structure of a database, and you want to customize your dataset to test one thing or another in particular, take a look at this Python package called python-testdata used to generate customizable test data. It can be set up to generate names in various forms, companies, addresses, emails, and more. The Github also includes some help to get started, as well as examples for use cases.
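To make the idea of customizable test data concrete, here is a minimal sketch using only the Python standard library. It does not use python-testdata or reproduce its API; the value pools, the field names, and the `make_record` and `generate_records` helpers are all hypothetical, chosen only to show the general shape of such a generator.

```python
import random

# Hypothetical sample pools, invented for this sketch.
# A real generator would draw from much larger lists.
FIRST_NAMES = ["Ada", "Grace", "Alan", "Edsger", "Barbara"]
LAST_NAMES = ["Lovelace", "Hopper", "Turing", "Dijkstra", "Liskov"]
COMPANIES = ["Acme Corp", "Globex", "Initech"]
DOMAINS = ["example.com", "example.org"]


def make_record(rng, record_id):
    """Build one fake user record from the pools above."""
    first = rng.choice(FIRST_NAMES)
    last = rng.choice(LAST_NAMES)
    return {
        "id": record_id,
        "name": f"{first} {last}",
        "company": rng.choice(COMPANIES),
        "email": f"{first}.{last}@{rng.choice(DOMAINS)}".lower(),
    }


def generate_records(count, seed=0, start_id=1):
    """Yield `count` records; the same seed always yields the same data."""
    rng = random.Random(seed)
    for offset in range(count):
        yield make_record(rng, start_id + offset)


records = list(generate_records(5, seed=42))
for record in records:
    print(record)
```

Seeding the generator is a deliberate choice in this sketch: a failing test can be rerun against exactly the same records, and changing the pools or the fields in `make_record` is how the data would be fitted to a particular database schema.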

I hesitated when I first saw this, given the overabundance of free data.

But with “free” data, if it is large enough, you will have to rely on sampling to gauge the performance of software.

Introducing the hazards of strange data into your tests may not be acceptable in all cases. Generated data, whose contents you control, avoids that.
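For a sense of what relying on sampling involves, the sketch below draws a fixed-size uniform sample from a dataset too large to hold in memory, using reservoir sampling (Algorithm R). The function name `reservoir_sample` and its parameters are illustrative assumptions, and `range` stands in for a real data source.

```python
import random


def reservoir_sample(items, k, seed=0):
    """Return up to k items chosen uniformly at random from an iterable.

    Makes a single pass and keeps only k items in memory (Algorithm R).
    """
    rng = random.Random(seed)
    sample = []
    for index, item in enumerate(items):
        if index < k:
            # Fill the reservoir with the first k items.
            sample.append(item)
        else:
            # Keep the new item with probability k / (index + 1).
            slot = rng.randint(0, index)  # both bounds inclusive
            if slot < k:
                sample[slot] = item
    return sample


# range() stands in for a large "free" dataset streamed from disk.
sample = reservoir_sample(range(100_000), 10, seed=1)
print(sample)
```

Even a uniform sample like this one tells you nothing about what the records contain, which is the hazard with data you did not construct yourself.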
