SMARTaddress


SMARTaddress cleanses and standardises addresses within Romania. It runs as SaaS (Software as a Service): you upload files, cleanse and standardise the address data, and download the results.

To allow for the input and output of addresses containing diacritics, the files must be in CSV (comma-separated values) format and encoded as UTF-16. Instructions for preparing files this way can be found under the Help menu.
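As an illustration only, a file that is currently in another encoding could be re-encoded before upload along the following lines. This is a minimal sketch, not the procedure from the Help menu: the function name, the file paths and the assumption that the source file is UTF-8 are all hypothetical.

```python
def convert_csv_to_utf16(src_path, dst_path, src_encoding="utf-8"):
    """Re-encode a CSV file as UTF-16, leaving its content unchanged.

    Assumption: the source file is UTF-8; pass src_encoding otherwise.
    """
    # newline="" stops Python translating line endings, so the CSV
    # rows are copied through exactly as they were written.
    with open(src_path, "r", encoding=src_encoding, newline="") as src, \
         open(dst_path, "w", encoding="utf-16", newline="") as dst:
        for line in src:
            dst.write(line)

# Hypothetical usage:
# convert_csv_to_utf16("addresses_utf8.csv", "addresses_utf16.csv")
```

Python's "utf-16" codec writes a byte order mark at the start of the file, which lets the reader detect the byte order.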

The cleansing process takes place in five basic steps: pre-processing, parsing, cleansing, geocoding (optional), and reporting. Postcodes are added as part of cleansing.

The cleansing process will be far more successful if the input data has been separated into individual columns or fields (county, locality, street type, street name, premises number). This doesn't have to be perfectly correct: provided the majority of, e.g., localities are in their own field, processing will be both quicker and more successful. Users tend to carry out this step using normal database techniques, i.e. they split the strings at comma separators, spaces, etc. before uploading to SMARTaddress.
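The kind of splitting described above might look like the following sketch. It assumes, purely for illustration, that each raw address is a single string of the form "county, locality, street type and name, premises number"; real data will vary, and the function and field names here are hypothetical rather than anything SMARTaddress requires.

```python
FIELDS = ["county", "locality", "street_type", "street_name", "premises_number"]

def split_address(raw):
    """Split one comma-separated address string into the five fields.

    Assumed layout: county, locality, "<street type> <street name>", number.
    Missing parts are left blank rather than guessed.
    """
    parts = [p.strip() for p in raw.split(",")]
    parts += [""] * (4 - len(parts))        # pad short addresses with blanks
    county, locality, street = parts[0], parts[1], parts[2]
    number = ", ".join(parts[3:])           # keep any extra parts, do not drop them

    # Treat the first word of the street part as the street type.
    street_type, sep, street_name = street.partition(" ")
    if not sep:                             # single word: assume it is the name
        street_type, street_name = "", street

    values = [county, locality, street_type, street_name.strip(), number]
    return dict(zip(FIELDS, values))

# Example:
# split_address("Cluj, Cluj-Napoca, Strada Memorandumului, 28")
# -> {'county': 'Cluj', 'locality': 'Cluj-Napoca', 'street_type': 'Strada',
#     'street_name': 'Memorandumului', 'premises_number': '28'}
```

The result does not need to be perfect; the aim is only to get most values into the right column before upload.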

The overall cleansing and standardisation process is quite time-consuming. Typical processing times are ~5 minutes per thousand records for 'clean' data, and up to ~15 minutes per thousand for 'dirty' data. However, once a run has started, you can log out and log in again later to check progress. The system processes only one job at a time, but you can upload files for address cleansing and download completed files while a cleansing run is in progress.
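The figures above can be turned into a rough estimate for a given file. The sketch below simply scales the two quoted rates (5 and 15 minutes per thousand records); the function names are hypothetical and actual run times will depend on the data.

```python
CLEAN_MINUTES_PER_THOUSAND = 5    # typical rate quoted for 'clean' records
DIRTY_MINUTES_PER_THOUSAND = 15   # upper rate quoted for 'dirty' records

def estimate_minutes(record_count, minutes_per_thousand):
    """Scale a per-thousand rate to the given number of records."""
    return record_count / 1000 * minutes_per_thousand

def estimate_range(record_count):
    """Return (best case, worst case) processing time in minutes."""
    return (
        estimate_minutes(record_count, CLEAN_MINUTES_PER_THOUSAND),
        estimate_minutes(record_count, DIRTY_MINUTES_PER_THOUSAND),
    )

# Example: estimate_range(10_000) -> (50.0, 150.0),
# i.e. roughly 50 minutes to 2.5 hours for 10,000 records.
```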

You are advised to process a sample of, say, 500 records before cleansing a large database. The system has been tested with over three million address records, but such a run takes several days of continuous processing time. Typically, users cleanse individual batches of 10,000 to 100,000 records (runs of between 1 and 10 hours). Go to Upload to load a new file, to Status to monitor the progress of an existing job, or to Cleanse to run a file which has previously been uploaded.
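One way to prepare such a sample is to copy the first 500 records of the full file into a separate file and upload that first. The sketch below is an assumption-laden illustration: it assumes a UTF-16 CSV file with a header row, and the function name and paths are hypothetical.

```python
import csv
from itertools import islice

def write_sample(src_path, dst_path, sample_size=500, has_header=True,
                 encoding="utf-16"):
    """Copy the first sample_size data rows of a CSV file to a new file.

    Returns the number of data rows written (the header is not counted).
    """
    with open(src_path, "r", encoding=encoding, newline="") as src, \
         open(dst_path, "w", encoding=encoding, newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        if has_header:
            header = next(reader, None)   # None if the file is empty
            if header is not None:
                writer.writerow(header)
        count = 0
        for row in islice(reader, sample_size):
            writer.writerow(row)
            count += 1
    return count

# Hypothetical usage:
# write_sample("all_addresses.csv", "sample_500.csv")
```

Taking the first rows is the simplest choice; if the file is sorted (for example by county), a random selection may give a more representative sample.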