Disclaimer: This dissertation has been written by a student and is not an example of our professional work, which you can see examples of here.

Any opinions, findings, conclusions, or recommendations expressed in this dissertation are those of the authors and do not necessarily reflect the views of UKDiss.com.

Data Analysis and Design: Approaches to File Processing Systems and Data Mining

Info: 9133 words (37 pages) Dissertation
Published: 9th Dec 2019

Reference this

Tagged: Information SystemsBusiness Analysis

Database Management Systems (DBMS)
1.1. Compare and contrast traditional file processing system with database environment
Contents Table

Improves decision making

More control of data redundancy

Consistency

Data sharing

Improved data integrity

Improvement in security

It enforces standards

Scale of economy

Data quality

Improvement in responsiveness and accessibility

Analysing the different approaches to data design

Conceptual data model

Enterprise data modelling

Logical data modelling

Logical modelling

A flat database

Hierarchal Model

The Relational Model

DATA MINING 1.2

Data warehousing

2.1

Normalisation

UNF

1NF

2NF

3NF

2.2

ENTITY RELATIONSHIP DIAGRAM

3.1

CUSTOMERS TABLE

ORDERS TABLE

DVD TABLE

Orderline table

Delivery table

CREATE TABLE payments

INSERTING VALUES INTO TABLES

CUSTOMER

ORDER TABLE

DVD TABLE

ORDERLINE TABLE

DELIVERY TABLE

PAYMENTS TABLE

Screenshot of customers table

Screenshot order table

DVD table screenshot

Screenshot orderline

Delivery table screenshot

QUESTION 4.1

Query A

show all dvd in store

Query B

Select all from customer

Query C

Query D

Query E

selecting customers with orders over £50

Query  F

Query G

select a particular customer

QUERY H

Why Test an RDBMS?

What Should We Test?

3. When Should We Test?

Writing Database tests

Setting up Database Tests

What Testing Tools Are Available?

Why we test a database

Testing processes and types

White box testing

Black box testing

 TASK-1

1.1.

 Compare traditional file processing system with database environment:

In the very early days there was no database computers because computers were not as powerful as the ones we have today.  In the early days, the powerful computers were mainly used for engineering and scientific calculations. They were slowly introduced in the business world mainly for business applications.  computers should be able to manipulate, store as well as retrieve large data files, and that is the reason why computer file processing system were created for this purpose.  Over time they have evolved, their basic purpose and structure have slightly changed.

Due to the more complex nature of business applications, the old traditional file processing had its flaws and limitations which has resulted in them being replaced by the database processing systems in most of the business applications nowadays

The old filing system are still used mainly

  •  for backing up database systems
  • To understand limitations and faults that arise in the old system in order to avoid these problems when designing new databases.

There are numerous disadvantages of the old filing system which are listed below.

Duplication of data- Applications are independently developed in the old file processing system which could result in unplanned duplication of such data which results in loss of the integrity of the data because data format might be relatively inconsistent or the data might not agree.  This could also lead to loss of metadata integrity.

Program data dependency- File descriptions are stored in each application program which accesses a particular file e.g. an invoicing system and the customers master file, which gives a full detailed description for both files, as a result if any changes have to be made to all programs that access these files and sometimes difficulties arise as it changes will not cover all programs that access file. Errors can also be made when making changes

Limited data sharing- with the old filing processing approach users have limited opportunities in sharing data outside of their own applications as each application has its privately-owned files.  It causes frustration when a requested report requires major effort in obtaining from numerous incompatible files from separate systems.  It could also require that other different organisations own these other files making it difficult to obtain.

Long development times- the old filing system offers little opportunity for leverage for previous development efforts.  The developer must create a new application from scratch designing the new files access logic for each new program created.  The longevity of creating the new program are quite inconsistent with the fast-moving business environment.

Excessive program maintenance- All other factors combine to make it a difficult program to maintain for companies that rely on this filing system.  a high rate percentage are allocated which leaves limited budget for creating new applications.

All the disadvantages listed are the limitations of databases if organisations do not apply the approach properly.

The advantages of the old filing system

There is no need for a highly skilled technician to handle the database and there is no external storage required on external platforms which are used to control and regulate data

Database Management Systems (DBMS):

In the early 1960’s an improvement was made upon the old filing system and thus, replaced with the Database Management System. This can be a small or very large database that holds lots of data.

This approach bases its emphasis on the integration and sharing of data throughout the organisation.  This requires a fundamental orientation or change in thought processes which should start from top management

The DBMS eradicated the problem of manually seeking data.  A solid structure was put in place to make it much easier to locate and retrieve data from relevant databases instead of going through repetitive files that are created on the old filing system.  A more logical view of imputing data.

A database is a collection of information which is arranged in order for it to be accessed when required by a user.  Data can be in various forms numeric, images, text and more. Which are stored in tables.  The basic idea of DBMS is for data to be extracted in a fast-reliable way when it is required for use.

DBMS creates a relationship with its high-level entities of the company its association within these entities.

Example:

  1. Customers
  2. Order
  3. Order line
  4. Product

Customers are required to order products from a border line and thus, a relationship is formed. In DBMS, it is required that entity relationship (ER) diagrams are used as they are used to show relationships of how the database would be modelled.

Diagram of a DBMS

The DBMS approach can offer much better advantages compared to the old filing system, only few can realise if the company implements many databases standing alone, if this is done it would be like the old filing system which would demonstrate the old filing system’s disadvantages.

The advantages are as following:

Improves decision making:

This Improves decision making and intracompany communications by installing a web based internet which is only accessible with the company or organisation.  Employee’s workstation can be used in accessing information swiftly e.g.  Phone directories, electronic mail and various other information.  Workstation can be used to access various databases through the internet

More control of data redundancy:

Redundancies are not completely removed from the DBMS approach but it can control redundancies in the actual database.

Consistency:

As redundancies are less in the database it makes the database more consistent and effective to the users as they will not input an entry and retrieve the wrong information required.

Data sharing:

DBMS allows all users within an organisation to share information.  only authorised users will be allowed to access information to certain aspects of data.

Improved data integrity:

This is the provision of consistency validity of stored data on the database. Constraints are normally expressed as integrity; these are consistent rules that the database cannot violate

Improvement in security:

DBMS protects data from unauthorised users. Passwords are required in order to users as well as restrict users to sensitive information that might not be required for that department, which can be implemented in operations such as retrieving, updating, inserting and deleting of the data.

It enforces standards:

The DBMS sets standards that have to be met while integrating the database from data formats, documentation standards, naming conventions, access rules and updating procedures

Scale of economy:

Savings are made when all of the companies’ operational data combined into one database with applications working on one data source.

Data quality:

Many tools and processes are provided to enhance quality of data constraints are created, and cannot be violated, as they are a rule of the database approach

Improvement in responsiveness and accessibility:

It is relatively easy for a user without programming experience to display and retrieve data even by crossing departmental boundaries

Analysing the different approaches to data design:

Data modelling can be described as the process of documenting and formalising of existing events and processes that arise during software application design and development.  The tools and techniques translate and capture complicated development and design which is translated into easily understood representation data processes and its flow, thereby creating a blueprint for its construction or updating previous versions.

A data model can be imagined as a flowchart or diagram which illustrates relationships between different forms of data.  Capturing all the possible relationships can be quite intensive but should be done with care to portray a well-documented model which allows for change when errors are sighted before coding is done.

Multiple models are viewed by the architecture to ensure that all entities, processes dataflow and relationships are identified

A database schema can be described as the structural skeleton, representing the logical view of the whole data base. Data is organised, defined and showing how relationships are associated, thereby showing all constraints that are to be applied on data.

The schema identifies its relationships and entities between them. A descriptive detail of the database is contained which can be illustrated by schema diagrams.

Schemas can be divided into two main categories

A database schema can be divided broadly into three categories

Conceptual data model:

This model identifies relationships at the highest level between its different entities.

Conceptual data modelling

 Enterprise data modelling:

This quite similar to the conceptual model but it bases it emphasis on the unique business requirement.

Enterprise data modeling

Logical data modelling:

This stands for the basis of creating the physical data model.  it outlines the specific attributes, relationships and entities for a business function.

Logical modelling

There are four main data models:

A flat database:

This the simplest database system created.  it represents data in single tables where all records are stored in single rows, delimiters, such as commas and tabs, are used in the separating rows of data.  Representation of tables are a simple text file.

Network database

This allows multiple records linked to same file.  this model can be described as a tree turned upside down whereby the branches are member information linked to the owner

Hierarchal Model:

In this model databases consist of number of records which are connected by links.  it is quite like the network model reach record has many fields or attributes of which a data value is contained. The link shows association between records

Advantages of this model are easy to understand and navigation through records is relatively fast. The drawback is that it does not allow for changes

The Relational Model:

The basic concept of a relational model are the tables.  Fields or columns are used in identifying all attributes

All data is view in the form of tables with each column representing an attribute.  Importance is placed on the relationships between entities by the values stored in each column of corresponding tables.

Each table must contain a unique identifying key which could be used in relating to other tables.

One main advantage is that it allows for changes to be made anywhere, unlike the Hierarchal Model

1.2

Data Mining:

The computerised process of investigating and analysing large amounts of data to extract the trends or patterns of the data.  These trends are monitored by data mining tools, to predict behaviours of future trends.  These allows the businesses to make important business decisions.  The data mining tool has resolved the problem of earlier years that were too time consuming rectify.  The mining tool observes the patterns and trend.  it also seeks out predictive information that experts might miss.

What can data mining do for our small DVD business are numerous other small businesses have taken up this concept. Tools are in place to gather up historic data using pattern recognition technology, mathematical and statistical methods to go through warehoused information.  it would help the company create relationships with what their customers would want to buy, the bestselling payments, create a safe environment for customers to purchase their DVD films, by being able to create trends on their database by obtaining more useful information about their customers.

The DVD business could have better decision making, all the patterns and trends analysed can help spot trends in sales which would result in smarter marketing of the DVD’s, it also show loyalty of customers

Other uses of data mining are:

  • Market segmentation it would identifythe common features of customers who regularly buy the same DVD’s from the company.
  • The company would be able to predict the mostly likely customer to deal with another competitor
  • They would also be able to identify a lot more fraudulent transaction based on the analysis of the data mining tool
  • Direct marketing would be identified by the inclusion in a mailing list to obtain the highest response rate.
  • They would be able to predict individual’s interests (being able to know what the customers would prefer to see on their websites
  • Also, would be able to evaluate the trends from month to month

Automated predictions of behaviour and trends it automates processes of locating predictive information, hands on analysis can be answered from the data, the use of promotional past mailing can be used to maximise the most likely return on investments in future mailing

Data Warehousing:

Illustrates the capturing of data from different sources that are used for useful access and analysis which don’t always start from the point of view of end user that may require access to specialised, occasionally local databases.  The latter statements are known as Data Mart

Data warehousing has two different approach which are:

  • Top down approach
  • Bottoms up approach

The top down approach is taken off data mart for specific groups of users after the complete data warehouse which has been created.

In the bottoms up approach firstly builds up the data mart then combining them into a single data warehouse.

A mainframe server or cloud (the storage technology used today) houses a data warehouse.  Data from numerous transactions online that process (OLTP) and other various sources is extracted to use by analytical applications and the user’s queries

2.1

 Normalisation:

This is the process of organising data in a database, this is done by creating tables which have relationships with the other tables which is done in order not to eliminate inconsistent dependency and redundancies.

Standardisation includes decaying a table into less excess (and littler) tables without losing data; characterising remote keys in the old table referencing the essential keys of the new ones. The goal is to separate information so that augmentations, cancellations, and adjustments of a trait can be made in only one table and afterward engendered through whatever remains of the database utilizing the characterized outside keys.

UNF:

Un-normalised form is a preliminary phase of the standardisation process permitting us to

Make an organized edge, illustrative of a bit of hierarchical information, for example, a

Shape or archive (e.g. receipt, report, buy request and so forth.). This is our underlying

Normalisation “connection” that contains both genuine information, taken from the structure or record,

Also, demonstrated information, based upon and reached out from the first from or report.

Right now, the un-standardized connection is only a major muddle of information however this preliminary

Stage is the most vital. As every phase of the normalisation procedure is needy

Upon the past, it is key for this, as the beginning stage, is set up with the privilege

Areas and information to guarantee a smooth move between the stages

For the case study, the un-normal form attributes are listed

ORDER (order no, order_ date, customer ID, customer _name, customer_ address, customer_ telephone, customer_ email, customer_ city customer _postcode dvd _id, dvd _title, unit _price, order _quantity, order_ total, delivery _date, delivery _address, debit _card, card _no, expiry _date, CVC_no, order _quantity, genre, directors, certification, dvd _title, subtotal)

1NF:

A database is in first normal form on the off chance that it fulfils the accompanying conditions:

Contains just nuclear qualities

There are no repeating attributes

ORDER (order no, order date, order total, customer ID, customer name, customer address, customer telephone, customer email, city. Customer postcode, debit card, card no, expiry date, CVC_no, delivery- date, delivery _address,)
ORDER-LINE (order no, dvd no, dvd title, unit price, order quantity, line total, genre, certification, directors, actors, titles, subtotal)

2NF:

A database is in second normal form on the off chance that it fulfils the accompanying conditions:

  • It is in first normal form
  • All non-key properties are completely utilitarian reliant on the essential key

In a table, if property B is practically reliant on A, however subject to an appropriate subset of A, then B is not practically being considered completely utilitarian subject to A. Consequently, in a 2NF table, all non-key traits can’t be reliant on a subset of the essential key. Note that if the essential key is not a composite key, all non-key traits are dependably completely practical subject to the essential key. A table that is in first typical frame and contains just a solitary key as the essential key is consequently in second ordinary structure.

ORDER (order no, order date, order, total, customer ID,)
ORDER-LINE (order no, dvd id, dvd title, unit price, order quantity, order total, order date, genre, title, directors, actors, certification, subtotal)
CUSTOMER (customer-ID, customer name, customer address, customer telephone, customer city, customer postcode, card no, expiry date, CVC_no,, debit_card, customer email )
DELIVERY (Delivery _no customer-ID, delivery date, customer_ address)

3NF:

A database is in third normal form on the off chance that it fulfills the accompanying conditions:

  • It is in second ordinary structure
  • There is no transitive practical reliance
ORDER (order no, customerID, order date, employee_ id)
CUSTOMER (customersID, customer name, customer address, customer telephone, customer postcode, customer city, customer email)
PAYMENTS (payment id, customer ID, debit card, card no, expiry date, CVC_no, order no, employee_ID,payment_colected )
DELIVERY (delivery no, customer-ID, delivery date, customer_ address, order no,)
ORDERLINE (orderline no, order no, DVD id, order quantity, unit price, subtotal,)

 

DVD ( dvd id, genre, certification, directors, DVD title, actors, DVD_ qty, unit price, order_quantity,,subtotal, grand total)

2.2

ENTITY RELATIONSHIP DIAGRAM:

An element relationship outline (ERD) is a graphical representation of a data framework that demonstrates the relationship between individuals, objects, spots, ideas or occasions inside of that framework. An ERD is an information demonstrating method that can characterize business forms and can be utilized as the establishment for a social database.

The diagram below illustrates the relationships between all entities of the DVD shop for the case study

3.1

CUSTOMERS TABLE:

CREATE table customer

(

customer ID int, Primary key

Last Name varchar (50),

First Name varchar (50),

Address varchar (100),

Telephone varchar (100),

Postcode varchar (20),

Email varchar (100),

City varchar (80)

)

ORDERS TABLE:

CREATE TABLE orders

(

order number char (100) PRIMARY KEY,

customer ID int (100),

order Date char (150),

order Total int(100)

);

DVD TABLE:

CCREATE table customer

(

customer ID int, Primary key

Last Name varchar (50),

First Name varchar (50),

Address varchar (100),

Telephone varchar (100),

Postcode varchar (20),

Email varchar (100),

City varchar (80)

)

Order Line Table:

CREATE table customer

(

customer ID int, Primary key

Last Name varchar (50),

First Name varchar (50),

Address varchar (100),

Telephone varchar (100),

Postcode varchar (20),

Email varchar (100),

City varchar (80)

)

DELIVERY TABLE:

CREATE TABLE delivery

CREATE TABLE delivery

(

delivery no int PRIMARY KEY,

customer ID int,

delivery date,

Address varchar (100),

order number char (100)

);

CREATE TABLE PAYMENTS:

(

payment_id int PRIMARY KEY,

customerID int,

Debit_card varchar (100),

Card_no  int ,

Expiry_date int,

CVC_no int

);

Inserting Values into Tables:

Customer:

INSERT INTO customer (customerID, LastName, FirstName, Address, Telephone, Postcode, Email, City)

Values

(10001, “Hamilton”,”Gregg”, “37 st julians road”, “07403777231”, “NW6 7LB”, “gregola2707@gmail.com“, “London”),

(10002, “Demideh”, “Gregory”, “126 Sapphire court warton road”, “0208535234”, “E15 2GG”, “geeman12@hotmail.com”,”London“),

(10003, “Mardle”, “Gregory”, “48 Woodstock road”, “02076043328”, “W12 8LG”, “bmardle6@yahoo.co.uk“, “London”),

(10004, “Best”, “Tiyana”, “55 base road”, “01215659880”, “B12 7JJ”, “tbest@aol.com“,

“London”),

(10005, “umaru”, “Dodo”, “56 Liverpool road”, “01617990122”, “L2 7FF”, “udodo@hotmail.com“,

“London”),

(10006, “James”, “Milner”, “45 new road”, “07579001233”, “W10 9TT”, jamilner@gmail.com”,”London);

Oder Table:

INSERT INTO orders (order number, customer, order Date, order Total)

values (0003,10001,”2016-03-30″,25),

(0004, 10003, “2016-03-14”, 15),

(0005, 10002, “2016-03-27”, 14),

(0001, 10004, “2016-03-24”, 17),

(0002, 10005, “2016-03-01”, 34),

(004, 10003, “2015-05-02”,5),

(003, 10002, “2015-05-10, 10),

(001, 10004, “2015-05-15, 10),

(002, 10005, “2015-05-20, 5),

(005, 10003, “2015-05-23, 12),

(005, 10003, “2015-05-24, 10),

(001, 10001, “2015-05-28, 15),

(016, 10004, “2015-12-15”, 10),

(017,10001, “2015-12-19” 12);

 DVD TABLE:

INSERT INTO dvd

(dvdid,title, director, genre, cert)

VALUES

(06, “ride along”, “gregg young”,”action”,”18″),

(02, “terminator”, “adrian skuza”, “action”, “15”),

(03, “the rock”, “tony benson”, “comedy”, “pg”),

(04, “beast of no nation”, “tony blair”, “thriller”,”12″),

(05, “something about mary”,”janet black”, “romance”, 15)

ORDERLINE TABLE:

INSERT INTO orderline

(orderline _no, order _number, dvd id, order _quantity, unit _price, sub-_total)

VALUES

(001, 1, 06, 7, 10, 70),

(002, 1, 04, 5, 10, 50),

(003, 2, 05, 6, 12, 72),

(004, 2, 03, 4, 8, 32),

(005, 3, 02, 2, 6, 6)

Delivery Table:

INSERT INTO delivery

(delivery no, customer, delivery date, Address, order number)

VALUES

(010, 10003, 2016-03-14, “48 Woodstock road”, 0004),

(020, 10001, 2016-03-30, “37 St Julian’s road”, 0003),

(030, 10002, 2016-03-27, “126 Sapphire court Warton road”,0005),

(040, 10005, 2016-03-01, “56 Liverpool road”,0002),

(050, 10004, 2016-03-24, “55 Base road”, 0001)

Payments Tables:

INSERT INTO payments (payment, customer ID, Debit card, Card _no, Expiry date, CVC _no)

VALUES

(2001, 10003, “yes”, 4711001034430933, 0120, 101),

(2002, 10002, “yes”, 5433101198008888, 0617, 904),

(2003, 10001, “yes”, 4500231483204751, 1118, 704),

(2004, 10004, “yes”, 4508088647210233, 1218, 002),

(2005, 10005, “yes”, 5322908690321454, 0721, 0576)

Screenshot of customers table:

Screenshot order table:

 DVD table screenshot:

Screenshot order line:

Delivery table screenshot:

QUESTION 4.1

QUERY A

 SHOW ALL DVD IN STORE:

SELECT * FROM [dvd]

QUERY B

SELECT ALL FROM CUSTOMER:

SELECT * FROM [customer]

QUERY C:

Note: I have used different other names on my database and selections made revolve around my names in this query Umar was used

SELECT Last name, first name, title

FROM customer, DVD, orders, order line

WHERE

last name = “Umar”

AND customer ID = orders customer ID

AND orders order number = order line order number

AND dvd id = dvd _id

The output from query

QUERY D:

SELECT Last name, first name, title, order Date

FROM customer, DVD, orders, order line

WHERE

order Date BETWEEN ‘2015-05-01’ AND ‘2015-05-29’

AND orders. customer ID = customer, customer ID

AND orders, order _number = order line. order _number

AND dvd id = dvd_id

QUERY E:

selecting customers with orders over £50

SELECT Last name, first name, subtotal

FROM customer, DVD, orders, order line

WHERE

subtotal > 50

AND order Date BETWEEN ‘2015-05-01’ AND ‘2015-05-30’

AND orders.customerID = customer customerID

AND orders.order_number = orderline.order_number

AND orderline.dvdid = dvd.dvdid

QUERY F:

DVD total sold in        March 2015 (NB used March and may for while creating tables)

SELECT sum (order quantity) AS “DVDs sold in March”

from order line

QUERY G:

 SELECT A PARTICULAR CUSTOMER:

SELECT Last name, first name, title, order Date

FROM customer, DVD, orders, order line

WHERE

last name = “Demideh”

AND order Date BETWEEN ‘2015-05-01’ AND ‘2015-05-30’

AND orders.customerID = customer.customerID

AND orders.order_number = orderline.order_number

AND orderline.dvdid = dvd.dvdid

QUERY H:

UPDATE customer

SET city=”Luton”, postcode=”LU1 9LL”, Address = “47 elm road”

WHERE Last Name=”James”;

WHY TEST AN RDBMS?

The purpose of testing is to ensure there are no faults or errors, and eliminating such errors which would improve quality of the data base

Security protects data from accidental or intentional destruction or misuse by controlled access to such data.

Database security concerns the utilisation of a wide scope of data security controls to shield databases from bargains of their respectability accessibility. Also, classification, ”

Unauthorised or abuse by approved database use

Malware infections

Creates harm to database servers brought on by PC room floods or fires

Corruption of data and its losses caused by the entry of invalid data or commands

Databases, the accumulation of interconnected records on a server, storing data, may not manage the same sort of information, i.e. databases might be heterogeneous.

This resulting in numerous types of usage and combination blunders may happen in substantial database frameworks, which adversely influence the framework’s execution, unwavering quality, consistency and security.

Along these lines, it is vital to test with a specific end goal to get a database framework steady.

There are various reasons why you need to create a proper testing strategy for any RDBMS:

  • Data is delicate and very iimportant business asset.
  • Critical business functionality being usedCurrent approaches aren’t sufficient.
  • Testing provides the concrete feedback required to identify defects.
  • Supporting evolutionary development.

Databases, the accumulation of interconnected records on a server, storing data, may not manage the same sort of information, i.e. databases might be heterogeneous.

This resulting in numerous types of usage and combination blunders may happen in substantial database frameworks, which adversely influence the framework’s execution, unwavering quality, consistency and security.

Thus, it is vital to test with a specific end goal to get a database framework to be consistent

Database testing undergoes four stages

  • Fixture setting
  • Test run
  • Verification outcome
  • Tear down

WHAT SHOULD WE TEST?

Figure A shows what you should be considered for testing with relational databases. The diagram is created from the point of view for a single database, the red dashed lines shows the boundaries of threats, showing that you need to consider threats both within the interface to the database and the database itself (clear box testing) and at the (black box testing). Table A lists the problems which you should consider testing for both internally within the database and at the interface.

Figure A. What to test.

Table 1. What we test in an RDBMS.

Black-Box Testing at the Interface White/Clear-Box Testing Internally Within the Database
  • O/R mappings (including the meta data)
  • Incoming data values
  • Outgoing data values (from queries, stored functions, views …)
  • Scaffolding code (e.g. triggers or updating views) which support refactoring
  • Unit tests for your stored protocols, functionalities, and triggers
  • Existence tests for database schema elements (tables, procedures, …)
  •  Definitions views
  • Referential integrity (RI) rules
  • Default values are used for each column one single column for data invariants
  • Data invariants which involves several columns

3. WHEN SHOULD WE TEST?

Most Agile software developers use the test-first approach for development whereby they write tests before writing enough production codes to fulfil test. The first task is to add a test quickly basically enough coding to fail. Next run tests, most times the complete test suites may decide to run only a subset, to make sure that the new test fail. You then update functional coding to ensure it passes the new tests. The fourth step is to run tests all over again. If result is a failure, update functional coding and retest. Once the tests pass, start over.as a next step.

Figure B the Process of Test First Development (TFD).

Test-driven development (TDD)

 

This an evolutionary method to development which does combine test-first refactoring and development.  When a software developer tries to implement new features, the first thing they question is if the design would enable them to add on the new feature if yes add it on, if no they refactor design to make adding it on possible then carry on with the TFD approach.  This approach is useful developing database schema and application coding.

A number of automated tools have been created by developers to run automated tests rather than carrying out manual tests over and over.  These tools make it possible to automate database testing scripts into the build procedure

 Writing Database tests:

Tests are written like you would for any other sort of test.  There a three-step process

  • Setting up the test: database must be in a known state before test can be run
  • Run the test the test must be run just like that of application tests
  • Checking results must be able to do table dumps to get the current values in database which are then compared against the expected results

 Setting up Database Tests:

The exact state of the database must be first known to successfully test your database the best way of doing this is by putting database in a known state there are two ways for carrying this out:

  • Starting afresh: rebuilding the database is a common practise it involves creation of the schema and the loading of test data for every major testing
  • Initialisation of data testers may consider forgetting the rebuilding and dropping of the databases in favour of reinitiating the data source, which can be either by erasing all existing data and inserting the data values back in the database or run updates to reset the values of data.  the first approach is less risky than the latter and could be considered faster for databases that holds large amounts of data

The creation of test data is an important part writing database tests and there are various methods of doing this

  • Having source testing data maintenance of an external definition of the test data, maybe in flat files, secondary set of tables and XML.  This data would then be loaded from external sources
  • Test data creation scripts maintain and developing scripts, the use of data manipulation language(DML), simply application source code or SQL which carries out the insertions, deletions and updates that are required to create test data
  • Self- contained test cases any individual tests put the database state which is a known state for testing

The test approaches used to create test data can be used in combinations or used alone.  An advantage of writing self- contained and creation test cases is that more likely that developers of that code might place it under configuration management (CM) control although it’s possible to put the test under CM control. An approach that suits the organisation should be adopted

Test data comes from unit testing.

What Testing Tools Are Available?

There are some good tools available that can be used on open source programming and other business devices that can be accessed which can enhance database testing.

.

Table 2. Some database testing tools.

Cat Desc Examples
Data Privacy Tools It gives Data privacy, which is a serious issue for many large organisations. By law organisations are meant to safeguard data.
Load testing tools These tools are for high usage database types will test system architecture
  • RadView
  • Mercury Interactive
  • Empirix
  • Web Performance
Test Data Generator Data needs to be tested against which to use to validate their system this method is more useful for stress and load testing

Turbo Data

Test Data Management  test data needs to be managed. It should also be defined, automated or manually or both and then maintained under version control. Define expected results of tests and compares them with the actual result.
Unit testing tools Database regression testing tools.

Why we test a database:

Data mapping

Data often travels back and forth from the (GUI) the Graphical User Interface to the back

end of the database to:

  • To cross check all fields in the GUI front end are mapped properly with the database tables
  • Testers would have to ensure right action was taken and the action taken is also successful.
  • CRUD Create Retrieve Update and Delete, at the backend gets invoked which corresponds with certain actions performed in the front end

ACID properties validation

ACID properties validation which stands for Atomicity Consistency Isolation and Durability, are the rules by which every database must stick by

Data Integrity

Any operations carried out, are updated and the much earlier values of shared data appear on all screens and forms.

If a value is updated on one screen it should, thus, be updated on all screens.
Process of database testing

The database has a general rule for testing which is also similar for other applications

Stage 1 prepare environment for testing

Stage 2 Run test

Stage 3 Check the results of the test

Stage 4 then validate with the expected result

Stage 5 Show the report of the findings from the test

Testing processes and types:

White box testing

Black box testing

White box testing:

This type of testing validates data models, database tables and database schemas.  it also performs SQL queries to justify relationships and repetition, referential integrity rules are checked.  Coding errors are detected which can eliminate bugs in the database is one main advantage of white box testing

Black box testing:

Verifies the data that’s incoming, data mapping and verifies the outgoing data from query function.

In my database tables I created I adopted the white box methods as this helped me run queries on the SQL and helped cut out the repetitions.  Unable to carry out black box testing as we are not able to deal with the backend of the coding.

 

17-07-2017-References

UKEssays. 2016. Traditional File Systems And Database Management Information Technology Essay. [ONLINE] Available at: https://www.ukessays.com/essays/information-technology/traditional-file-systems-and-database-management-information-technology-essay.php. [Accessed 28 April 2017.

What is the difference between a file system and a database? – Quora. 2016. What is the difference between a file system and a database? – Quora. [ONLINE] Available at: https://www.quora.com/What-is-the-difference-between-a-file-system-and-a-database.

Data Mining: What is Data Mining? 2016. Data Mining: What is Data Mining? [ONLINE] Available at:http://www.anderson.ucla.edu/faculty/jason.frand/teacher/technologies/palace/datamining.htm

SearchSQLServer. 2016. What is data mining? – Definition from WhatIs.com. [ONLINE] Available at:http://searchsqlserver.techtarget.com/definition/data-mining

SearchSQLServer. 2016. What is data warehouse? – Definition from WhatIs.com. [ONLINE] Available at:http://searchsqlserver.techtarget.com/definition/data-warehouse

Data Warehousing Concepts. 2016. Data Warehousing Concepts. [ONLINE] Available at:https://docs.oracle.com/cd/B10500_01/server.920/a96520/concept.htm

Cite This Work

To export a reference to this article please select a referencing stye below:

Reference Copied to Clipboard.
Reference Copied to Clipboard.
Reference Copied to Clipboard.
Reference Copied to Clipboard.
Reference Copied to Clipboard.
Reference Copied to Clipboard.
Reference Copied to Clipboard.

Related Services

View all

Related Content

All Tags

Content relating to: "Business Analysis"

Business Analysis is a research discipline that looks to identify business needs and recommend solutions to problems within a business. Providing solutions to identified problems enables change management and may include changes to things such as systems, process, organisational structure etc.

Related Articles

DMCA / Removal Request

If you are the original writer of this dissertation and no longer wish to have your work published on the UKDiss.com website then please: