DBFeeder_logo.png SourceForge.net Logo
Contents
  1. Introduction
  2. Basic Components
  3. User Interface
  4. Project Configuration

Introduction

Have you ever tried to create database records by hand? If yes you know very well how difficult and boring is to get records inside the database, even for very simple table structures. A couple of unique keys and few referential integrities make this task already impossible.

This is the reason why we created DBFeeder.

DBFeeder reads your table structures directly from database and it generates automatically test data which fits table constraints. But DBFeeder can even do more. DBFeeder makes use of a powerful data creation mechanism which comes through the Data::Generate module.

This perl module generates synthetic data by parsing given text statements. The central mechanism here is the concept of an inverted regular expression:

Usually a regular expression (regex) acts as a pattern to filter rows in a record set. Data-Generate uses regex's inversely: the pattern acts as a data creation rule to create a matching record set.

Data-generate extends this concept behind regex expression syntax to handle special data types like dates or floats and to let users control record value distributions. In other words, this module provides a kind of data creation language to let you fine tune the kind of generated data.

DBFeeder integrates Data-Generate in a column based approach:DBFeeder associates each column of each table in user's schema with a data creation rule. DBFeeder reads these rules from several configuration files: per table column there are configuration files containing corresponding data creation rules (read more on corresponding section).

You cannot however tune directly all data creation process. Foreign key matching for example, is achieved by directly copying parent column's data into the child column structure, so here to customize child column's behaviour you have to manipulate the parent column.

Table cardinality is another diffcult point to control. Why? Imagine that you have a table with a DECIMAL(8) primary key column. Theoretically you can store there millions of records. However let's say that you want only positive numbers starting from 1 and having three 0's at the end. With this choice you have the solution space of this column decreasing dramatically from 10^8 to 10^4 (10'000) possible values. Foreign and unique key constraints have even a bigger influence on cardinality, so in order to know cardinality of current table you have to know cardinality (and data) of parent tables. That's why table cardinality tuning works basically interactively in DBFeeder (read more on corresponding section).

In conclusion, data creation is a difficult, sometimes very difficult job, but we hope that this becomes now a lot easier with DBFeeder. Enjoy using DBFeeder.

The DBFeeder Team.


Basic Components

DBFeeder works with following components:

  1. User interface: When you work with DBFeeder you work most of the time with the Text UI (User Interface), altough some actions (for example changing of column configuration files) have still to be performed by hand. This interface is menu based.
  2. Projects: before starting data generation, users need to create a DBFeeder project. DBFeeder associates user's schema as well as a project root directory to the project.
  3. System directory tree: DBFeeder creates this directory tree at project's creation. The tree is located just one level below project's root directory.Please do not manipulate data in this directory: DBFeeder stores here internal information about schemas.
  4. Input directory tree: this directory tree contains all column configuration files. DBFeeder creates all these files at projects creation and stores there default startup data creation rules. The tree, which is located just one level below project's root directory, follows database structure so that you can easily find specific column configuration files. This is the place where users can perform data tuning.
  5. Output directory: this directory,located at the same place as input and system directory, contains the generated sql scripts that you can use to populate your database with data. DBFeeder overwrites these files each time that you generate data.

User Interface

With the Text UI you can steer most of DBFeeder functions. This interface is menu based with a keypress navigation system: for each keypress action there is a corresponding textual description. While numbers are reserved to trigger specific actions, the letters A,B and optionally the letters N,P,F,L have always the same function and can be used to navigate into the menu pages. See the example screenshot below.

Manual_ss_01.png

The text menu has mostly predifined submenus, except for menus which works with project lists like for exampe the 'LOAD PROJECT' menu. Here below the menu tree structure:

  1. MAIN MENU
    1. DATA MENU. With this menu you can generate testdata. Please load a project before switching to this menu.
      1. generate sql scripts Choose this menu item to generate SQL scripts in the output directory. DBFeedeer generates three main sqlplus scripts: an 'insert tables' script; a 'delete tables' script and a 'count records' script.Previously existing files will be just overwritten.
      2. run sql cleanup script: With this menu item you run a corresponding 'delete tables' sqlplus script which deletes all table contents.
      3. run sql data count script: With this menu item you run a corresponding 'count table records' script.
      4. run sql insert script: With this menu item you run the corresponding 'insert table records' script which effectively populates database tables.
    2. GLOBAL SETUP MENU: with this submenu you can access global DBFeeder configuration settings.
      1. set ANSI terminal. Switch terminal to ANSI Mode (default: OFF)
      2. set config file. Path to the configuration file for global settings. This menu point is just informative. You cannot change config file at runtime, Start DBFeeder with cl option -c config-file instead.
      3. set cutoff fk merge card. This is a very special parameter setting. Please do not change the default value unless you know what you want. (Default 10)
      4. set dbi driver. Give here your DBI driver (Default is dbi:Oracle). Tested drivers at the moment are dbi:Oracle and dbi:ODBC.
      5. set demo directory (default: demos). Change here this value if you move DBFeeder demos directory somewhere else than the DBFeeder installation path.
      6. set maximal table cardinality(default 10000). This is an overall maximum of rows produced. Adapt this value to your needs. However keep in mind that memory consumption increases accordingly to to number of rows produced.
      7. set predefined default rules( default: repos/Defaults/Oracle). Change here this value if you move DBFeeder repos directory somewhere else than the DBFeeder installation path.
      8. set project list file(default: Project_list.dat) Change here this value if you keep the project list file somewhere else than in the DBFeeder installation path
      9. reset default values. Reset all global parameters to original defaults.
      10. set sqlplus(:=sqlplus).Please change this vaule to your effective sqlplus path in case the command is not visible in your standard environment.
      11. set version(:=0.9)This menu point is just informative. You cannot change the value here.
    3. PROJECT MENU: with this submenu you can administer your projects.
      1. CREATE DEMO PROJECTS: choose this submenu to install demo projects in your database. See the tutorial for details of this menu section.
        1. CREATE PROJECT HR_DEMO: installs a demo schema based on Oracle HR_DEMO sample.
        2. CREATE PROJECT TEST_FK_DEMO: installs a technical demo schema to demonstrate accuracy of DBFeeder's foreign key algorithm.
      2. CREATE NEW PROJECT: choose this menu item to setup project variables and create a new project.
        1. set DB INSTANCE: give here the value of the DB instance (e.g. ORACLE SID).
        2. set DB PASSWORD: password for DB account.
        3. set DB USER: user name for DB account.
        4. set PROJECT FOLDER: project root directory location.
        5. TEST DB CONNECTION: choose this option to check if your connection settings are working.
        6. MAKE PROJECT: choose this option to let DBFeeder examine your database schema and to let it create the corresponding files below project's root directory.
      3. DELETE PROJECT: choose this submenu to delete existing projects from current project list. Pay attention: both the project entry in the DBFeeder project list as well as project directory will be deleted!
      4. LOAD PROJECT: choose this submenu to load an existing DBFeeder project from current project list in memory. You must load a project before you can generate test data.
      5. REBUILD PROJECT: choose this submenu to rebuild an existing DBFeeder project from current project list. Rebuilding a project is a quite interesting feature: in this way DBFeeder recreates all configuration files below project's root directory according to current database structure. Altough at the moment column's configuration files are overwritten, we plan for the next release a merging functionality which will preserve old configuration files.

Project Configuration

DBFeeder configuration system is based on the powerful Data::Generate module: this module lets you control the generated data by defining corresponding data creation rules. In this way you can generate data for various datatypes, like in the examples below:

  1. Create varchar data with numeric and alphanumeric values. Defining VC(10) [14][2579]{4} (25%) | [A-G]{2}[X-Z][QN] (75%) gives you about 1/4 of the data like 12222,15222, ... and 3/4 of the data ike AAXQ,BAXQ,...
  2. Create a list of integers. Defining INT (9) +/- 0 [0,3]{2} gives you the possible values 0,3,30,33,-3,-30-33
  3. Simulate datetime values (it could be for example job scheduling data). Defining DATE '1999' 'nov' [07,thu-fri] '09' : '09' : '09' ; gives you a set of date values corresponding to the 7th and all thursdays and fridays of the month of november 1999.

DBFeeder reads data generation rules on the input directory tree, which is located just one level below project's root directory. Two different files are associated to each table column:

  1. A column configuration file which is specific for a particular column of a particular table.
  2. A default configuration file which is associated to all equally named (and equally typed) columns.

These two files are in a hierarchical relationship. You can merge default configuration rules into column configuration rules by using the keyword _SCHEMA_DEFAULT_ in column configuration rules. In fact, to ensure that default rules are 'per default' propagating, DBFeeder creates initial column configuration files of such kind:

STRING [A-Z] {2} (50%) | _SCHEMA_DEFAULT_ (50.0%)

That means that in half of the cases the rule '[A-Z]{2}' will be applied while in the other half of the cases the value contained in the corresponding default configuration file is taken.

While column configuration files are located under the correspoinding directory tree at 'column level', default configuration files are located at 'schema level'. For example, let assume that you are looking for configuration files of column COUNTRY_NAME of the HR_DEMO demo project (table COUNTRIES), which is stored in the database MACDB01 and is installed on the directory tree /Developer/Projects/V1/Data/HR_DEMO...

  1. ...then you would get the default configuration file in /Developer/Projects/V1/Data/HR_DEMO/Input/MACDB01/HR_DEMO/Defaults/COUNTRY_NAME8.CFG (the number here is a DBFeeder internal column indexing)
  2. ...while the column configuration file would be stored in /Developer/Projects/V1/Data/HR_DEMO/Input/MACDB01/HR_DEMO/COUNTRIES/COUNTRY_NAME/COUNTRY_NAME.cfg

Thanks to this defaulting system you can often just work with default files and practically skip the configuration at column level. In this way number of files to be edited and configured gets greatly reduced.