Which Database?
Do you need a database but you're not sure which one to use: MS Access, SQL Server, Azure SQL, MySQL, etc?
Or perhaps you have an MS Access database and you're wondering whether to upsize to SQL Server, or migrate to the cloud (Azure SQL)?
"Database" is often used in general discussion to refer to a complete application or program that manages data. i.e. That includes the data storage tables plus the interface elements: the queries, forms, reports and menus. However in this discussion we use "database" to refer to just the component of an application that stores the actual data. i.e. That does not include any user interface elements. (If it is a server database it may include code such as stored procedures and triggers and these will be touched on later).
Choosing an appropriate database is a critical step in application development.
There are situations where a MS Access database is adequate and the higher cost and complexity of hosting a database in a data server like SQL Server or MySQL is unnecessary. But there are also cases in which using a data server is essential.
The discussion that follows aims to help you decide whether to store your data in a "file database" like MS Access or to host it in a data server such as SQL Server or MySQL.
Before we compare MS Access, SQL Server and MySQL let's briefly consider MS Access alone.
A complete MS Access database solution, including tables, queries, forms, reports and menus can be developed in a single file (mdb or accdb) but this is rarely done because the majority of databases will have multiple users and this architecture is not suitable for multi-user operation.
So instead, with the typical MS Access application architecture:
- there are two files:
- a "Data File" which contains the tables,
- an "Interface File" which contain the menus, forms, reports and queries;
- the Data File resides on a server;
- a separate copy of the Interface File exists on each user's computer and must know the name and location of the Data File;
- when a user opens a form, report or query they are effectively opening one or more connections to the Data File in order to view, create, modify or delete data;
- there may also be a Workgroup Security File if the Data File is an mdb file and workgroup security has been implemented.
If the Data File is moved to a different location (such as a different folder or even a different server) each Interface File needs to be "reconnected" to the Data File and users will then continue to use the application as they did before. Reconnection can be achieved manually via the wizard that is built into MS Access or automated via the execution of some VBA (Visual Basic for Applications) code.
Moving the Data File, deleting the entire Data File or copying the Data File (e.g. to CD, USB device or another computer) are as simple as moving, deleting or copying any other file via Windows File Explorer. This simplicity is both one of the advantages and one of the disadvantages of storing data in a MS Access file. (An example of advantage would be the case of wanting a database that can be easily emailed to different locations/personnel in it's entirety).
An alternative architecture is to still use the MS Access Interface File but to host the database in a data server such as SQL Server or MySQL. It will then be necessary to pass through the strong security provided by the data server in order to move, delete or copy the data.
We have produced large database systems, using solely MS Access for both the Interface File and the Data File, that include security options for users (to limit the particular data that each user can view, create, edit and delete) and these systems continue to be used very effectively in both government departments and commercial businesses. Using this architecture allowed us to deliver a very comprehensive, powerful and flexible system at relatively low cost.
There are of course many alternative interfaces (or "frontends") to SQL Server or MySQL databases (such as .Net Winforms and Webforms, ioS or Android apps, etc) but this discussion is about the data store, not the interface.
To help you consider the relative merits of using MS Access, SQL Server or MySQL for your database this discussion is broken into the following considerations:
- Proprietary or Open Source
- Processing Location
- Maximum Data Volume
- Maximum Concurrent Users and Robustness
- Scalability
- Other Limits
- Security
- Performance / Speed
- Transaction Logging and Audit Trail
- Stored Procedures, Queries, Views and Triggers
- Backup and Recovery
- Uptime
- Personnel to Install, Optimise and Maintain
- Licence Cost
- Server or Cloud
- IT Department Perception
- Semantics
- Questions
Proprietary or Open Source
This part of the discussion relates only to data servers such as MS SQL Server and MySQL, not to file databases like MS Access.
If you must have a data server, for any of the reasons discussed in the sections below, and you are committed to using Microsoft technology then the choice of data server is simple: the proprietary MS SQL Server.
A number of editions of SQL Server are available - from the free Express Edition to the Web Edition and the top-level Enterprise Edition.
Open source alternatives include MySQL and MariaDB. (There is a also a proprietary version of MySQL).
"Open source" tends to sound like something you get free but it's not quite that simple. MySQL (the open source version) and MariaDB are offered under the GPL which allows users to "run, study, share and modify the program" at no cost.
So superficially they are free. However, database developers and end users don't want, or have the time, to view or modify the source code behind the data servers. Instead, in addition to features, they want reliability, maintainability and support - and these are offered via editions for which there is an annual fee.
The key objective of offering software, including MySQL and MariaDB, under the GPL is to encourage developers who have the relevant knowledge and skills to contribute to bug-fixing and ongoing improvement.
So only Microsoft works on MS SQL Server whereas any developer anywhere in the world can contribute to the ongoing development of MySQL (open source version) or MariaDB.
According to Wikipedia:
- MySQL was created by a for-profit Swedish company in the 1990's,
- was acquired by Sun Microsystems in 2008;
- and then, despite the creation (by a MySQL founder) of a movement to prevent or limit it, was acquired by Oracle Corporation in 2009.
Since Oracle's acquisition was apparently unconditional (despite attempts to ensure that MySQL would continue to be available under the GPL forever) it would seem that they are free to set the licensing terms and fees.
According to the MariaDB website: "MariaDB was created in 2009 by the same people behind MySQL, one of the most popular Open Source relational databases. The founders were inspired to develop MariaDB as an alternative to MySQL, after Oracle acquired MySQL through the Sun Microsystems acquisition."
According to Wikipedia, MariaDB:
- was started as a GPL-only fork of MySQL,
- is based on the same code as MySQL server 5.5, and
- "aims to maintain compatibility with Oracle-provided versions".
MS SQL Server, MySQL and MariaDB are all widely used, offer both free and paid versions, and can be used behind simple local applications through to high-availability websites and mobile apps.
Processing Location
The frontend, or interface, of an application is typically referred to as the "client" of the database.
"Client" can also refer to the computer on which the frontend or interface is displayed.
With MS Access, all processing is done in or on the client. i.e. All code is executed, and all queries are run, within the frontend or interface.
When the database is in a data server like SQL Server or MySQL there is a certain amount of processing that can occur there before data is transmitted across the network.
Consider the example of a table of employees containing 10,000 records, and an application user who only wants to see a list of the 150 employees in a particular department. With an MS Access database on a server and a MS Access frontend/interface on the user's computer the full 10,000 records will be processed by the client to extract the 150 required, so 10,000 records will have to travel across the network. If instead the database is in SQL Server the extraction of the 150 records can occur on the data server and only 150 records will travel across the network to the client for display there.
There is one twist to this: thin client. It is possible for a MS Access frontend/interface to be installed on a server and deployed to users via thin client technology which, in simple terms, means that just the image of query results, forms and reports (not actual data) travels across the network for display on the user's device. In this case the client device, from an MS Access perspective, is actually the server computer and the extraction of the 150 records from the 10,000 would occur there.
Local area network (LAN) speeds are now so great that this might not be very important for an application that will be used only within one site and by a moderate number of users but for a database accessed from multiple sites, and most definitely for a database accessible by large numbers of users across the internet, this is a critical consideration.
Maximum Data Volume
MS Access has a maximum database size of 2Gb. Realistically the database should be upsized into SQL Server or another data server long before it reaches this limit.
Data servers measure maximum database size in terabytes. That's a lot of data!
The free version of SQL Server 2016 ("Express") has a lower data limit of just 10Gb. A big advantage of using this to run a small database in preference to MS Access is that if the database grows significantly to the point that it needs a full SQL Server licence there will be no change required to the codebase and interface.
(A comparison of other specifications for the various editions of SQL Server 2016 is available here.
Maximum Concurrent Users and Robustness
MS Access allows for a maximum of 255 concurrent users. However, as with the maximum data volume, a MS Access database should be upsized into SQL Server or another data server long before it reaches this limit.
Based on our observations of MS Access behaviour, plus the experiences of users of our MS Access applications and comments from the global developer community, we'd be recommending a move from data storage in MS Access to SQL Server when concurrent user numbers exceed about 25. That number is not "set in concrete" and is not based on rigorous testing; it is just our general judgement. Another respected supplier of MS Access solutions recommends moving to SQL Server when user numbers exceed 100. However our opinion is that, while MS Access may be capable of sustaining up to 255 users, other considerations such as the increased potential for corruption, the higher value of the data (due to the increased number of users impacted if there is an outage), and transaction logging and data recovery provide the dominant reasons for the use of SQL Server instead of MS Access when CONCURRENT user numbers exceed 25 or so.
SQL Server 2016 has a limit of 32,767 user connections.
The term commonly used in relation to reliable performance with high numbers of concurrent users is robustness. SQL Server, and other data servers like MySQL, have been designed from the ground up to handle thousands of simultaneous connections whereas MS Access is more likely to suffer corruption with considerably fewer users (for example due to something as common as a broken connection from a desktop computer being turned off without properly exiting programs).
Robustness is a key consideration when a database is accessed across the internet or from multiple geographically separate locations.
Consider the case of a power failure on a user's computer when they have an application open that is in the process of entering data into a MS Access database that is shared across multiple sites. What actions are needed? Potentially this has corrupted the database and the correct course of action is to perform a "repair and compact" operation on the database. However this is only possible if all users close all applications that are using the database. This means that the user who suffered the outage either has to contact all potential users, or get an administrator to do so. Across multiple sites this is clearly undesirable. Within a single site it would be an annoyance but much more readily achievable. (It is easy to check that all users have closed connections to the database by the absence of, or ability to delete, a "locking file" in the Data File folder).
By comparison, data servers like SQL Server will cleanly handle broken connections and avoid data corruption.
A robust data server is essential for databases behind websites, and for mission-critical enterprise applications used 24x7.
Scalability
Scalability is the ability of a database to handle a significant increase in the number of concurrent users and/or the amount of data stored.
The limits on each of these have been discussed above.
Clearly if you know that the user numbers and/or data size will ultimatedly require a data server like SQL Server it is wise to start with this.
It is possible to "upsize" an MS Access database to SQL Server, and there are tools available to help with this, but there will always be time and cost (and potential disruption to users) associated with this.
So the key is to understand upfront the likely number of users and/or amount of data and make the right decision for the database to be used.
However, if an application has started life with an MS Access database and becomes more popular and more useful than originally anticipated the decision will have to be made on how best to scale it to meet the higher demands. This will undoubtedly include migration to a data server like SQL Server or MySQL but the decision will also have to be made on whether to modify the MS Access interface to work with the new database or to create a completely new interface (e.g. with .Net).
Scalability doesn't just refer to jumping from 100 users to 1000 users, or from 50Mb to 50Gb of data. With high availability websites the jump might be from 10,000 concurrent users to 100,000 or more, or from 5Gb to 5Tb of data. In these cases we are not talking about upsizing from MS Access to SQL Server; what we are talking about is the ability of the data server to service these large numbers and to offer performance tuning options to ensure the most responsive end-user experience.
Other Limits
Technical specifications are available for MS Access, SQL Server and other data servers that set out numerous limits such as the number of objects in a database, the number of fields per table, etc.
In the twenty years we've spent building MS Access applications one limit that has been encountered many times is the maximum number of indexes on a single table. The figure is 32. While that sounds like quite a lot it includes both the indexes defined by the developer (e.g. on date fields) and also an index for every relationship in which the table is the parent or child. On a number of occasions this limit has necessitated splitting a table into several linked tables - something that would have been unnecessary in SQL Server.
Security
As already described above, a MS Access application will typically have an Interface File on user's computers plus a Data File on a server.
In order for users to operate the application and work with data stored in the Data File they must have been granted read/write permissions by the server administrator on the folder that contains the Data File plus the file itself. In addition, if the Data File is an mdb file, and workgroup security has been applied to it, there will be a Workgroup Security File (with an mdw extension) that normally resides in the same folder as the Data File, and read/write permissions will also have to be granted on this file to the application users. (Use of workgroup security allows users to be assigned to groups that grant read, create, edit and/or delete permissions to specific tables. It cannot limit access to subsets of records within tables).
The necessary folder/file permissions are at the heart of the biggest security problem with MS Access. Because application users must have read/write permissions over the Data File, and Workgroup Security File if present, these files can be copied to a CD, USB storage stick, or other media, or even emailed. That allows the user to take the data offsite. Of course the server permissions limit the personnel who can use and access these files, and the Interface File supercially isolates users from the Data File (and Workgroup Security File), but file theft can still occur.
With SQL Server (and other data servers) this cannot happen because the data server software is like a gatekeeper who much more stringently controls access to the data. More granular options exist for granting data access permissions so, for instance, a user granted access to read/write data might not also be granted permission to backup a database, or copy or move an entire database. Access to data can also be controlled via views or stored procedures that limit users' access to just specific parts of particular tables or to subsets of the records in tables.
SQL Server offers two logon (or "authentication") modes. With the "Windows Authentication" mode a user's MS Windows logon, which grants access to the organisation's network and to specific folders and files, is also the logon to SQL Server and will determine which databases and data the user can access (and also what administrative actions can be performed). This avoids having a separate logon for each database used.
Our OSHatWork program, that is used around Australia to manage WHS, risk and training, is built with MS Access. The frontend is MS Access and the data is stored in a MS Access data file. Flexible security options are provided via carefully defined security groups, and each user is granted membership of one or more of these groups. e.g. If a user is given a read-only membership of the Incidents group they will be able to view, but not change, incident data. If another user is given read-write membership of the same group, they will be able to view, create, edit and delete Incident data. In order to provide these security options these applications use the mdb file format.
MS Access has served us, and our clients, very well for many years but clearly there are important reasons, including security considerations, why hosting your data in a data server such as SQL Server or MySQL may be more appropriate than in a MS Access database.
Performance / Speed
On the surface of it you might expect an MS Access database that is upsized to SQL Server to perform faster. However the operations that occur within the data server to deliver the security, transaction logging and other features might actually result in slower performance.
So speed should not be a reason to use SQL Server in preference to MS Access. In fact, if the application is just being used over a local area network, doing processing (such as running a query) on the client device (i.e. in the application frontend/interface) may be signficantly faster than if it was done on the server. When MS Access is used as the frontend/interface to a SQL Server database options exist to perform some processing in the data server and some in the frontend/interface to achieve the best possible performance.
One area in which SQL Server (or MySQL) may significantly outperform MS Access is in running complex queries involving multiple subqueries.
Stored procedures within a data server also allow a sequence of operations to be performed, using the processing power of the server, with just the result returned to the frontend/interface.
Of course if a database is being accessed across the internet it is critical to minimise the volume of traffic and in these cases processing on the data server will ensure that only the data that needs to be transmitted is transmitted.
Transaction Logging and Audit Trail
The transaction logging available in data servers like SQL Server allows transactions to be "rolled back" to a previous point in time (e.g. to remove incorrect data changes).
It can also be used in data recovery by allowing data changes to be applied to a previous backup to bring it "up to date".
Transaction logging also provides an "audit trail" that enables an administrator to see which users made which changes to data.
MS Access does not have these features though an audit trail can be created, by an application developer, via code in the interface/frontend plus suitable tables in the Data File to log the changes.
Stored Procedures, Queries, Views and Triggers
Enforced relationships (with options of cascade delete/update), primary keys, foreign keys and indexes are relatively similar in MS Access databases and the databases hosted in data servers.
But what SQL Server and other data servers also offer are views, stored procedures and triggers.
Views allow the data server to deliver data to the client from joined tables, or to deliver a subset of the data in a table or tables. This reduces network traffic.
Stored procedures can comprise one or several coded instructions to return or manipulate data.
Triggers fire when events occur in the database (such as insert, update, delete, etc) and can be used to enforce data integrity by preventing a change or modifying other data in response to a change.
The availability of views, stored procedures and triggers is very much a part of the client-server model in which data-specific intelligence is embedded in the data store and presentation logic is in the frontend/interface (with additional business logic sitting between these). With a MS Access database the corresponding intelligence/logic is almost entirely in the frontend/interface which is why so much more data passes across the network with this architecture. (In the accdb file format MS Access does offer "data macros" which are somewhat similar to triggers).
Backup and Recovery
Catastrophic hardware or software failures (or even lockouts created by malware) can create the need to restore data from backups. MS Access databases can also suffer from corruption (for example due to a power outage on a user computer).
With a data server like SQL Server or MySQL a process is normally run automatically at intervals (usually at least daily) to create a backup of user data and system data/settings. Transaction logging will also create a continuous record of data changes.
A MS Access database can also be backed up via an automated process but there is a big difference.
When a MS Access database is backed up what occurs is simply that a copy is made of the Data File (and the Workgroup Security File if one exists). The problem with this is that if any application is using the Data File, and particularly if it is in the middle of a write operation, when the file is copied the resulting backup copy could be corrupted. While it may be possible to create a custom backup solution that can determine whether the database is in use before trying to copy it this assumes that there is some point in each 24 hour period when the database should not be in use - which may not be an issue when it is used at only one location but becomes problematic when the database is accessed from multiple time zones.
By comparison, backup from data servers like SQL Server can occur while databases are in use. Typically the option also exists to perform incremental backups. So if the volume of new/modified data each day is a relatively small proportion of the overall size of a database it may be decided to do an incremental backup each day and then a full backup once per week (or longer). This avoids generating a massive volume of backups.
Transaction logging in data servers also provides considerable flexibility when the time comes to perform a data recovery and avoids loss of data. With MS Access however the only recovery process possible when a database has become corrupted beyond repair is to replace the Data File completely with the most recent backup, and of course the consequence of this is that any data changes made between the date/time of the most recent backup and the date/time of the discovery of the corruption will be lost.
Fortunately a corrupted MS Access database can often be repaired via the builtin "repair and compact" facility or via the separate JetComp program. Alternatively importing the data into a new Data File can sometimes eradicate the corruption. There are also third-party programs available to fix corruption. In 20 years of developing and using MS Access applications, and supplying our OSH At Work application, we have struck only one instance when a third-party utility was required to fix a corruption (and that was for an Interface File, not a Data File). We have never struck an instance of a corrupted MS Access database that could not be restored by one of the methods noted here. However all corruptions result in an interruption to the work of users which is why for a mission-critical database that is needed 24x7 a data server like SQL Server is a better choice than an MS Access database.
Uptime
Data servers like MS SQL Server and MySQL are designed for 24x7 uptime. Maintenance operations and backup can be performed while the served databases are in use. By comparison, for MS Access it is necessary to get all users to close all applications that are using a database before it can be backed up or before a "repair and compact" operation can be performed.
Personnel to Install, Optimise, Maintain
Personnel for MS Access
Installing and using a MS Access application with a MS Access database does not require any IT specialists.
MS Access itself is installed along with the other programs in the MS Office suite such as MS Word and MS Excel. (If your copy of MS Office didn't include MS Access there is a free runtime version available that allows applications to be run).
The files that make up each MS Access application (the Interface File, Data File, etc) are simply copied to their required locations on user computers and the server.
The only part that might be considered tricky by some is the creation of startup shortcuts when the application in question also uses an mdw file to control the availability of application objects or data to particular users. With our commercial MS Access application products we supply an installer (created with NSIS) that creates these shortcuts but it can be done manually by an end user with appropriate care.
Personnel for SQL Server / MySQL
When a data server (like SQL Server or MySQL) is used in a large organisation there will generally be at least one person assigned to a Database Administrator (DBA) position. This person will oversee the initial installation, occasional update and ongoing maintenance of the server software. They will also oversee the deployment of databases into the data server, will undertake performance assessment and optimisation, and will configure and respond to alerts generated by the data server.
Small/medium organisations would not be able to justify a full-time DBA so these tasks may be performed by the general network and systems administrators or by external consultants who are brought in as needed.
Licence Cost
Many organisations have MS Access as part of their licences for the MS Office suite.
Even if an organisation does not have MS Access as part of MS Office there is a free runtime version of MS Access available that can be used to run completed applications (but not undertake development work).
Although there is a free version of MS SQL Server available it is more likely that an organisation using it for multiple databases will want a paid subscription licence in order to access the general support, and the assistance with upgrading to new versions (and access to high-end features in some cases), that this provides.
Server or Cloud
It wasn't too many years ago that you needed to have your own server computer (running Windows Server or Linux as appropriate) in order to use a data server like SQL Server or MySQL. There are significant costs of acquiring, establishing and maintaining such a server.
But now you have the choice of having your data server hosted "in the cloud".
Microsoft's cloud equivalent of SQL Server is called Azure SQL.
A Microsoft Access application can be hosted within an Azure account, and linked to data in Azure SQL. This allows a Microsoft Access application to be deployed to users over the internet, using only a narrow bandwidth, and therefore with good performance.
IT Department Perception
The ease of development in MS Access, and the freedom for end users to create database applications without the involvement of the IT Department, resulted in organisations having numerous database applications that had been designed and built by people who had no formal training in the skills required to do it well. These applications were fine when they were just used by their creator to solve a basic business requirement but when they needed to be used by more people, or when the creator had left the organisation and changes were needed to it, the IT Department got involved and frequently discovered something that wasn't to professional standards and might be riddled with bugs or contain code that was extremely difficult to understand or maintain.
Sadly this resulted in MS Access applications getting a bad name within IT departments (particular in large organisations). However when we (or any other equally capable, professional, tertiary-educated developer) creates an MS Access application the end result can be a comprehensive, cost-effective, user-friendly and readily maintainable and extendable system. Certainly that is the case with the products and custom solutions we've developed - that have been used in government departments and private companies for close to 20 years. (See a reference for our work) .
So don't be dissuaded from using MS Access by someone who has had a bad experience with it, or knows someone who has had a bad experience with it, due to the "creativity" of end users.
Of course, as described in the sections above, there are many very legitimate reasons why an IT department may consider a database hosted in a data server such as SQL Server or MySQL to be a better solution than a MS Access file database.
Semantics
This discussion has not compared MS Access to SQL Server or MySQL, it has compared the relative merits of MS Access databases versus databases hosted in a data server such as SQL Server or MySQL.
MS Access itself is both a development environment (for tables, menus, queries, forms and reports) and also a runtime environment in which MS Access applications are run.
SQL Server (or MySQL) on the other hand is first and foremost software that "serves up" data and provides many options related to that (like security).
The line gets a little blurred when we consider that SQL Server also offers SQL Server Reporting Services (SSRS) which provide for the creation and deployment of both desktop and mobile reports.
Questions
Hopefully this discussion will help you to make an informed decision about whether to store your data in a MS Access database or within a database hosted by a data server such as MS SQL Server or MySQL.
If you have any questions or would like to discuss a project please get in touch.