Refactoring Copy/Paste Pattern with Data-Structures
Introduction
Following up the "Refactoring Copy/Paste Code With the Use of Delegates" I will ramble a bit more on how to refactor copy/paste code that exists in the wild.
Background
There are many situations where code blocks get repeated over and over again, if there are only a few occurrences that's not so bad. But, eventually there's a limit beyond more than a few that means you should try another approach.
One of the most common occurrences of repetition it's on projects where there are a lot of calls to stored procedures or queries, which is the case in most CRUD applications. And so, any class that has code for database access will have a lot of methods in which the only a small portion of code lines differs from each other.
This has a knack of making the code difficult to update, when changes have to be shared by all methods. There are countless opportunities for mistakes that arise from boredom or a botched search and replace edit.
The solution to this problem means that we have to separate the main flow from the data. And thus, refactoring this kind of pattern by creating custom data-structures that are adapted to the needs of one's application. Also, one can add some cleverness into them, making it easier to develop the handling code.
Exemplifying with Ado.Net
So, we start with this common pattern, as you can see it's mostly boiler plate code. And, if there are only a small number of occurrences, there's nothing wrong to use it in this way.
public DataSet GetPersonsInformationWithFilter(char filter_sex)
{
DataSet persons = new DataSet();
string connectionstring = ConfigurationManager.ConnectionStrings["default"].ConnectionString;
SqlConnection connection = new SqlConnection(connectionstring);
string query = "SELECT * FROM PERSONS WHERE PERSONS.SEX = @sex ;";
SqlCommand command = new SqlCommand(query, connection);
SqlDataAdapter adapter = new SqlDataAdapter(command);
SqlParameter sex_parameter = command.Parameters.Add("@sex", SqlDbType.Char);
sex_parameter.Direction = ParameterDirection.Input;
sex_parameter.Value = filter_sex;
try
{
connection.Open();
adapter.Fill(persons);
}
catch
{
// do some logging here ...
throw;
}
finally
{
connection.Close();
adapter.Dispose();
command.Dispose();
connection.Dispose();
}
return persons;
}
In this case it is relatively easy to refactor these type of methods through the use of data structures that model the database call operation. And this can be done while maintaining the method signatures so that the rest of the application code doesn't get broken.
Here are some simple data structures to model the SQL command and the SQL parameter, conveniently called QueryDefinition and ParameterDefinition.
The QueryDefinition class will be responsible for defining the database command call, be it a query (select, update, insert, delete) or a stored procedure call.
While the ParameterDefinition class will define the input/output parameters.
public class QueryDefinition
{
public string ConnectionSetting
{
get;
set;
}
public string CallText
{
get;
set;
}
public CommandType CallType
{
get;
set;
}
public List<ParameterDefinition> Parameters
{
get;
set;
}
}
public class ParameterDefinition
{
public ParameterDefinition(string name, SqlDbType dbType, ParameterDirection direction)
{
this.Name = name;
this.DbType = dbType;
this.Direction = direction;
}
public ParameterDefinition(string name, SqlDbType dbType, ParameterDirection direction, int size)
: this(name, dbType, direction)
{
this.Size = size;
}
public ParameterDefinition(string name, SqlDbType dbType, ParameterDirection direction, object value)
: this(name, dbType, direction)
{
this.Value = value;
}
public ParameterDefinition(string name, SqlDbType dbType, ParameterDirection direction, int size, object value)
: this(name, dbType, direction, size)
{
this.Value = value;
}
public string Name
{
get;
set;
}
public SqlDbType DbType
{
get;
set;
}
public int? Size
{
get;
set;
}
public ParameterDirection Direction
{
get;
set;
}
public object Value
{
get;
set;
}
}
These classes store the minimum requirements for building an Ado.Net command, these will be passed as a parameter and define an operation, like get a list of receipts or insert customer information.
I will also add a clever feature, adding a delegated method to deliver the output. This will make make the code more generic and it easier to reuse for most types of output.
protected T AccessDataBase<T>(QueryDefinition definitions, Func<SqlCommand, T> output_functor)
{
using (SqlConnection connection = new SqlConnection(
ConfigurationManager.ConnectionStrings[definitions.ConnectionSetting].ConnectionString))
{
try
{
connection.Open();
using (SqlCommand command = this.CreateCommand(definitions, connection))
{
return output_functor(command);
}
}
catch (Exception ex)
{
throw;
}
finally
{
connection.Close();
}
}
}
protected IEnumerable<T> AccessDataBase<T>(QueryDefinition definitions, Func<SqlDataReader, T> map_functor)
{
using (SqlConnection connection = new SqlConnection(
ConfigurationManager.ConnectionStrings[definitions.ConnectionSetting].ConnectionString))
{
using (SqlCommand command = this.CreateCommand(definitions, connection))
{
connection.Open();
using (SqlDataReader reader = command.ExecuteReader())
{
while (reader.NextResult())
{
yield return map_functor(reader);
}
}
connection.Close();
}
}
}
private SqlCommand CreateCommand(QueryDefinition definitions, SqlConnection connection)
{
SqlCommand ret_command = new SqlCommand(definitions.CallText, connection) { CommandType = definitions.CallType };
foreach (SqlParameter parameter in this.BuildParameters(definitions.Parameters))
ret_command.Parameters.Add(parameter);
return ret_command;
}
private IEnumerable<SqlParameter> BuildParameters(IEnumerable<ParameterDefinition> definitions)
{
foreach (ParameterDefinition item in definitions)
yield return this.BuildParameter(item);
}
private SqlParameter BuildParameter(ParameterDefinition definition)
{
SqlParameter retParam = null;
if (definition.Size != null)
retParam = new SqlParameter(definition.Name, definition.DbType,
Convert.ToInt32(definition.Size)) { Direction = definition.Direction };
else
retParam = new SqlParameter(definition.Name, definition.DbType) { Direction = definition.Direction };
if ((definition.Direction == ParameterDirection.Input || definition.Direction == ParameterDirection.InputOutput)
&& definition.Value != null)
retParam.Value = definition.Value;
return retParam;
}
As you can see there are two AccessDataBase methods, one for dealing with a single return, either object or value type, and IEnumerable for dealing with queries that output more than one row.
The method that returns an enumeration is also different cause its anonymous function is a mapping operation, so as it iterates through the data adapter class it will transform the selected data onto the expected return type.
I will now add some helper methods to populate the parameters list and setup a cleaned up body method responsible for calling the stored procedure.
These are over the counter and ready to use helper methods that do the most frequent operations.
protected DataSet AccessDbDataSet(QueryDefinition definitions)
{
Func<SqlCommand, DataSet> out_functor = (SqlCommand command) => this.GetDataSet(command);
return this.AccessDataBase<DataSet>(definitions, out_functor);
}
protected int AccessDbNonQuery(QueryDefinition definitions)
{
Func<SqlCommand, int> out_functor = (SqlCommand command) => command.ExecuteNonQuery();
return this.AccessDataBase<int>(definitions, out_functor);
}
In this case, I made the decision that the pattern of usage of this class would be by sub-classing, or deriving from it. But this can also be done through writing the class as an utility class.
So, here are the simplified methods, the class method signature can remain the the same but changes to the inner code and structure are more easy to perform.
public DataSet GetSomeProcedure()
{
QueryDefinition definition = new QueryDefinition()
{
ConnectionSetting = this._connection,
CallText = "ProcedureName",
CallType = CommandType.StoredProcedure,
Parameters = new List<ParameterDefinition>()
{
new ParameterDefinition("result",
SqlDbType.Structured, ParameterDirection.ReturnValue)
}
};
return this.AccessDbDataSet(definition);
}
public DataSet GetSomeQuery(int id)
{
QueryDefinition definition = new QueryDefinition()
{
ConnectionSetting = this._connection,
CallText = "select * from atable where ID=@id;",
CallType = CommandType.Text,
Parameters = new List<ParameterDefinition>()
{
new ParameterDefinition("@id",
SqlDbType.Int, ParameterDirection.Input) { Value = id}
}
};
return this.AccessDbDataSet(definition);
}
public IEnumerable<Person> GetPersons()
{
QueryDefinition definition = new QueryDefinition()
{
ConnectionSetting = this._connection,
CallText = "select name, address, age from persons;",
CallType = CommandType.Text,
Parameters = new List<ParameterDefinition>() { }
};
Func<SqlDataReader, Person> map_functor =
(reader) => new Person() {
Name = reader["name"].ToString(),
Address = reader["address"].ToString(),
Age = (int) reader["age"]
};
foreach (Person item in this.AccessDataBase<Person>(definition, map_functor))
yield return item;
}
The first consequence is that the code gets more terse, and you only write what you need to define input and outputs. So, less space errors.
Second, you have opened new ground for more interesting changes that can simplify even more the writing of your data layer, you can write a fluent interface to deal with whole process. You can also, if need to be, develop a serialization strategy without writing too much new code.
Points of Interest
This pattern of refactoring is not only useful for simplifying database access code, but any situation where the entry parameters are the only variant code section on the methods.
This kind of refactoring has other consequences as well, it means that one can serialize the data-structures and turn the whole process dynamic. Easing up the need for compiling new binaries when changes on the data structure don't break the code.
History
First submission - 06-01-2011
Update submission - 13-01-2011