asp tutorials, asp.net tutorials, sample code, and Microsoft news from 15Seconds
Data Access  |   Troubleshooting  |   Security  |   Performance  |   ADSI  |   Upload  |   Email  |   Control Building  |   Component Building  |   Forms  |   XML  |   Web Services  |   ASP.NET  |   .NET Features  |   .NET 2.0  |   App Development  |   App Architecture  |   IIS  |   Wireless
 
Pioneering Active Server
 Power Search





Active News
15 Seconds Weekly Newsletter
• Complete Coverage
• Site Updates
• Upcoming Features

More Free Newsletters
Reference
News
Articles
Archive
Writers
Code Samples
Components
Tools
FAQ
Feedback
Books
Links
DL Archives
Community
Messageboard
List Servers
Mailing List
WebHosts
Consultants
Tech Jobs
15 Seconds
Home
Site Map
Press
Legal
Privacy Policy
internet.commerce














internet.com
IT
Developer
Internet News
Small Business
Personal Technology
International

Search internet.com
Advertise
Corporate Info
Newsletters
Tech Jobs
E-mail Offers

HardwareCentral
Compare products, prices, and stores at Hardware Central!

Get Your Data Faster with a Data Cache
By John Watson
Rating: 4.0 out of 5
Rate this article


  • email this article to a colleague
  • suggest an article

    Introduction

    Storing frequently used lookup data in a database is a great idea (e.g. order status codes, state names, etc.) that saves tremendous amounts of time in design and maintenance. However, retrieving that data from the database every time it is needed is very inefficient. This article describes how to use Application variables to cache frequently used lookup data in memory to achieve lightning fast access times. In my tests, I've seen as much as a 5000% increase in performance.

    This is a topic I've seen covered by others in the past but never in enough detail to satisfy me. This technique is extremely powerful and can yield huge performance gains over non-cached queries and therefore deserves more attention.

    Here's the scenario. You have data stored in your database that is accessed frequently but is rarely changed. By "rarely", I mean relative to how often it is accessed. For the sake of simplicity and the examples to follow, let's say the data consists of shipping methods for an e-commerce system. The table looks like this:

    
    create table ShippingMethod (
    	[Method] char(2) not null,
    	[Description] varchar(50) null,
    	...etc...
    )
    
    
    Method is the primary key of this table and is stored along with every order. Whenever a customer is placing an order on this site, they will need to choose a shipping method. The traditional way of displaying this information is to query the database and return the results, thus showing the available methods. This requires opening a database connection, running a query on the database server, and returning the results to the web server. The query may go to the disk drive or the data may be cached in the database server's memory. But, a roundtrip to the database server, possibly including opening a new connection, was still necessary to get the records.

    Here's the dilemma: the retrieval process is slow but the database is the ideal way to store the information. What's a performance conscious webmaster to do?

    Step 1: Caching the data

    Caching the data in the web server is really a two-phase process. You must first retrieve the data from the database and store it in memory (caching); second, you have to have some way of retrieving that data from memory when you need it. These two steps can be easily encapsulated in two ASP function calls.

    First, let's think of how we'll store the data in memory. Database tables are usually 2-dimensional objects so a 2-dimensional array will work nicely. It turns out that ADO has a recordset object method that will return a recordset as a VBScript array. Handy! Here's the function:

    
    01:sub cachedata()
    02:	dim rsData, aData
    03:	
    04:	' Put valid data into application("GLOBAL_SHIPPING_UPDATE")
    05:	if (not isdate(application("GLOBAL_SHIPPING_UPDATE"))) then
    06:		application.lock
    07:		application("GLOBAL_SHIPPING_UPDATE") = #12/31/1990#
    08:		application.unlock
    09:	end if
    10:		
    11:	' If last update more than 5 minutes old then refresh
    12:	if (abs(datediff("n", application("GLOBAL_SHIPPING_UPDATE"), now())) >= 5) then
    13:		' Get data
    14:		set rsData = server.createobject("adodb.recordset")
    15:		rsData.open "select * from ShippingMethod", DBCONN
    16:		if (not rsData.eof) then aData = rsData.getrows()
    17:		rsData.close	
    18:		set rsData = nothing
    19:		
    20:		' Store in global memory
    21:		application.lock
    22:		application("GLOBAL_SHIPPING") = aData
    23:		application("GLOBAL_SHIPPING_UPDATE") = now()
    24:		application.unlock
    25:	end if
    26:end sub
    
    
    

    Line 12 is arguably the most important line in the function. It is the line that determines if a database query is run or not. Without line 12, this function would simply do a query and you would get no performance gain. Simply, line 12 says, if the data has been cached less than 5 minutes ago, then don't bother trying to cache it again. In other words, we've got a fresh copy so don't both reloading the data.

    If the cached data is more than 5 minutes old -- it's stale -- then refresh the data. Lines 14-18 take care of this task. A database connection is opened and a query is run to retrieve the recordset we want to cache -- in this case, the contents of the ShippingMethod table. Line 16 uses the getrows() method to store the contents of the table in an array, aData.

    Finally, the data is cached in memory. This is done through the use of an application variable. Lines 21-24 store the data in memory. Line 22 stores the actual data and line 23 stores the time that the data was saved. Line 23 is critical as well since it is the time used by line 12 to determine if the data is "fresh" or "stale" and needs to be re-cached.

    And that's it. Lines 5-9 simply store a valid time in application("GLOBAL_SHIPPING_UPDATE") that is in the past to ensure that the comparison on line 12 will not cause any errors. These lines are necessary to make sure that the data is cached the first time someone attempts to access it after the server has been started.

    For this example we've set our "freshness meter" at five minutes. This means that, at most, the data will be refreshed at five-minute intervals (less often if users are accessing the data less frequently). The implication is that if the data in the database is changed, then users may not see the change for up to five minutes.

    On to using the data.

    Step 2: Retrieving the data

    This function will vary greatly depending on your needs but it should always have two main steps: 1) call cachedata() and 2) retrieve the data from the global application array. The function should call cachedata() every time -- cachedata() will decide for itself whether or not it needs to actually refresh the data. This way, you can have multiple display functions using this cache without having to duplicate any of the caching logic. Here's an example function for our ShippingMethod table:

    
    01:sub drawMethods()
    02:	dim aData, i, sMethod, sDescription 
    03:	
    04:	' Get the latest data
    05:	call cachedata()
    06:	
    07:	aData = application("GLOBAL_SHIPPING")
    08:	if (not isarray(aData)) then
    09:		response.write "Error: No data available."
    10:		exit sub
    11:	end if
    12:	
    13:	response.write "<select name=""frmShipping"">"
    14:	for i = 0 to ubound(aData,2)
    15:		sMethod = aData(0,i)
    16:		sDescription = aData(1,i)
    17:		response.write "<option value=""" & sMethod & """>" & sDescription
    18:	next
    19:	response.write "</select>"
    20:end sub
    
    

    Line 5 calls cachedata() which does its magic to cache our data in server memory for us. Lines 7-11 get the cached data into a local array and validate the data. If the variable is not an array then something has gone wrong so display an error message and exit. Finally, lines 13-19 display the data for us. In this case, by writing out a <select> tag with all of the available shipping methods.

    Using this function as a template, you should be able to write multiple functions for accessing the same cached data by simply rewriting lines 13-19. For example, one version for displaying all shipping methods as a <select> box and another for displaying radio buttons.

    The Payoff

    Ok, that's all fine, you say, but what's it get me? I mean, it seems like a lot of extra work to do a simple query. Good question. Rather than just taking my answer on faith, we'll do a little more programming and you can prove it to yourself.

    First, let's write two functions to measure the performance of retrieving the data. The first function will use the traditional method of running a database query to get the data. The second function will use the cache. Then we'll time them and see which is faster and by how much. Neither function will actually display any of the data since all we really want to measure is the speed of retrieval.

    This function uses a traditional database query to retrieve the data:

    
    sub getDataDB()
    	dim aData, rsData
    
    	' Get the data
    	set rsData = server.createobject("adodb.recordset")
    	rsData.open "select * from ShippingMethod", DBCONN
    	if (not rsData.eof) then aData = rsData.getrows()
    	rsData.close	
    	set rsData = nothing
    end sub
    This function uses cachedata() and retrieves the data from memory:

    
    
    sub getDataCache()
    	dim aData
    
    	' Populate the cache
    	call cachedata()
    	
    	' Get the data
    	aData = application("GLOBAL_SHIPPING")
    end sub
    
    

    Finally, this function will now call each of the above functions many, many times and time how long it takes. The elapsed time will be divided into the number of runs to calculate executions per second -- i.e. whether this caching thing is really worth it.

    
    sub benchmark()
    	dim t1db, t2db, tdb, t1cache, t2cache, tcache, i, n
    	
    	n = 1000
    	t1db = 0
    	t2db = 0
    	tdb = 0
    	t1cache = 0
    	t2cache = 0
    	tcache = 0
    	
    	' Time getDataDB()
    	t1db = cdbl(timer)
    	for i = 1 to n
    	call getDataDB()
    	next
    	t2db = cdbl(timer)
    	tdb = cdbl(t2db - t1db)
    	
    	' Time getDataCache()
    	t1cache = cdbl(timer)
    	for i = 1 to n
    	call getDataCache()
    	next
    	t2cache = cdbl(timer)
    	tcache = cdbl(t2cache - t1cache)
    	
    	response.write "Traditional elapsed time: " & tdb & "<br>"
    	response.write "Traditional exececutions/s: " & round(n/tdb, 2) & "<br><br>"
    	response.write "Cache elapsed time: " & tcache & "<br>"
    	response.write "Cache exececutions/s: " & round(n/tcache, 2) & "<br><br>"
    	response.write "<b>Performance increase: " & _
    		round(((n/tcache - n/tdb)/(n/tdb))*100, 2) & "%</b> <br>"
    end sub
    
    
    

    To run this benchmark, simply paste all three subroutines into an ASP page and add a call to benchmark(), e.g.:

    
    <html>
    <body>
    <%
    ...
    ' subs go here
    ...
    
    call benchmark()
    %>
    </body>
    </html>
    
    
    

    Two things to keep in mind: first, it may take a long time to run. If it takes too long, try reducing the number of runs (n) and/or increasing the script timeout value. Second, even though this benchmark is already taking the average of many runs, you need to run the benchmark several times consecutively so that the numbers can "stabilize" -- I suggest at least three times. The main reason for this is that your database probably doesn't have the number of open connections necessary to service this many requests. So, the first time you run the benchmark, a lot of time will be spent opening database connections. This unfairly penalizes getDataDB() since you aren't likely to actually do this many sequential queries in a real-world scenario; and if you are doing this many transactions, then the database connections would already be open anyway.

    Here are the results I achieved with this benchmark on a PIII-600 with 256MB of RAM running IIS4.0 and SQL7.0 (n = 1000). The database and web server were on the same machine but similar performance gains were achieved with a distributed configuration. As I said earlier, throw out Run 1.

    Method

    Measurement

    Run 1

    Run 2

    Run 3

    Run 4

    Database

    Elapsed

    18.1953125

    7.12109375

    7.109375

    7.0625

     

    Executions/s

    54.96

    140.43

    140.66

    141.59

    Cached

    Elapsed

    0.12109375

    0.1171875

    0.12109375

    0.12109375

     

    Executions/s

    8,258.06

    8,533.33

    8,258.06

    8,258.06

     

    Performance Gain

    14,925.81%

    5,976.67%

    5,770.97%

    5,732.26%

    About the Author

    John Watson is the owner and chief consultant of Watson Consulting, an ASP contract programming and consulting company in southern California. Watson Consulting has been using ASP since its introduction into IIS to create innovative and robust Web application solutions for our clients. Please visit our Web site to learn more about Watson Consulting or our flagship product Active Publisher, an extremely powerful yet affordable ASP web site content management system.

  • Rate This Article
    Not HelpfulMost Helpful
    1 2 3 4 5
    Supporting Products/Tools
    Proposion N2N
    Proposion N2N connects Microsoft .NET applications to Lotus Notes and Lotus Domino databases. This ADO.NET managed data provider allows you to perform blindingly fast queries and updates of Notes data from ASP.NET pages, .NET web services, Windows, or Mobile applications. An innovative SQL-like query language leverages the unique features of Notes and makes collaborative software accessible to relational database programmers.
    [Top]
    Other Articles
    Sep 15, 2005 - Building an Image Keyword System
    Unlike text-based file formats image files aren't made up of words, which makes searching for an image file by keyword difficult. Instead of being able to simply open the file to see what it contains, we're stuck looking at the text around it and other metadata to determine the image's meaning. In this article, Ziran Sun shows you how to build a simple database-based image keyword system that allows you to associate keywords with images and use these keywords to make finding images easier.
    [Read This Article]  [Top]
    Apr 7, 2005 - A Step-by-Step Guide To Using MySQL with ASP.NET - Part 2
    In the second part of of his article on using MySQL with ASP.NET, Ziran Sun covers how to add a new MySQL user to the database server, assign the user the appropriate permissions, connect to the database, and build a simple ASP.NET page to perform a query.
    [Read This Article]  [Top]
    Feb 10, 2005 - A Step-by-Step Guide To Using MySQL with ASP.NET - Part 1
    Back in the days of classic ASP, if you were building a database-driven web site, your choice was either to invest a lot of money to get a copy of Microsoft SQL Server (or some other enterprise-ready database) or invest a lot of time finding a way to deal with the performance and scalability limitations of Microsoft Access. Luckily these days there's another viable alternative: MySQL.
    [Read This Article]  [Top]
    Jan 27, 2005 - Moving a Database from SQL Server 7.0 to SQL Server 2000
    Moving or copying a SQL Server database from one machine to another requires a lot of preparation in order to ensure a smooth transfer. In this article, Dina Fleet Berry examines the different methods and highlights the different issues associated with each of them.
    [Read This Article]  [Top]
    Jan 6, 2005 - Debugging a SQL Stored Procedure from inside SQL Server 2000 Query Analyzer
    There are many times when using SQL Server 2000 Query Analyzer to debug SQL statements is a better choice than debugging in Visual Studio .NET. In this article, Dina Fleet Berry explains why and walks you through the debugging process step-by step.
    [Read This Article]  [Top]
    Nov 24, 2004 - Persisting .NET Objects to SQL Server Using SQLXML and Serialization
    As a follow up to his article on retrieving objects from SQL Server using SQLXML and serialization, Gianluca Nuzzo discusses saving objects back to SQL Server using a schema definition file and updategrams.
    [Read This Article]  [Top]
    Sep 14, 2004 - Transaction Processing in ADO.NET 2.0
    One area that stands out when comparing ADO.NET 1.x to ADO.NET 2.0 is transaction processing. Bill Ryan shows just how easy transaction processing has become with the TransactionScope object in ADO.NET 2.0.
    [Read This Article]  [Top]
    Sep 8, 2004 - Custom Object Data Binding with .NET
    Developers often use brute force coding to marshal data between the GUI and application objects. In this article, Luther Stanton explains how to use .NET's out-of-the box data-binding functionality to make this job much easier.
    [Read This Article]  [Top]
    Sep 2, 2004 - Queue MSMQ Messages from SQL Server
    Learn how to create a console application to queue a message in Microsoft Message Queuing (MSMQ) and then use an extended stored procedure to call the console application from a SQL Server trigger.
    [Read This Article]  [Top]
    Aug 30, 2004 - Tuning Up ADO.NET Connection Pooling in ASP.NET Applications
    Connection pooling increases the performance of Web applications by reusing active database connections instead of creating a new connection with every request. This article shows how to monitor the connection pool, diagnose a potential problem, and apply the appropriate fix.
    [Read This Article]  [Top]
    Mailing List
    Want to receive email when the next article is published? Just Click Here to sign up.

    Support the Active Server Industry

    internet.comearthweb.comDevx.commediabistro.comGraphics.com

    Search:

    Jupitermedia Corporation has two divisions: Jupiterimages and JupiterOnlineMedia

    Jupitermedia Corporate Info

    Legal Notices, Licensing, Reprints, Permissions, Privacy Policy.
    Advertise | Newsletters | Tech Jobs | Shopping | E-mail Offers