Alfresco architecture

Posted by Max Dunn Sat, 15 Apr 2006 17:58:00 GMT

Alfresco is a new but promising open source document management system. It is Java based and supports many of the latest standards, including the JSR-170 content repository API, the JSF tag-based interface, the Spring framework, JSR-168 portlets and WebDAV file transfers.

While Alfresco is still young and lacks some functionality, it is well architected and shows a lot of promise. Here is some information about its architecture.

1. Licensed under MPL

The Mozilla Public License (MPL) makes it clear that any changes you make to their code need to be open sourced. However, any independent files that you create may remain closed, even if they link to the Alfresco code. The following is from the Alfresco Solutions page, third paragraph from the bottom.

Content-enhanced Applications

ISVs and application developers who wish to enhance their applications, such as CRM or ERP systems, can provide content management and search services in their applications with a low-cost, royalty-free, super-fast content management system. The Alfresco system provides a web service interface so that no matter what language your application is implemented in, you can integrate enterprise-level content management capabilities. The Alfresco MPL license means that the license status of your code is not affected.

For more information on the MPL see:

2. Java Based

Alfresco is written in Java and can run under any J2EE server, but it uses only JSPs and servlets, so it can also run under Tomcat. This allows it to run efficiently in almost any server environment. There are distribution packages for Tomcat and JBoss.

Alfresco requires the latest version of Java SE 5 (also known as 1.5). For use under Tomcat, it also requires the latest version of Tomcat 5.5.

3. Supported Databases

Alfresco uses Hibernate, an object-relational mapping (ORM) persistence and query service. This allows Alfresco to use any JDBC-compliant database, including Oracle, PostgreSQL, MS SQL, MySQL, DB2, Firebird, SAP DB, Sybase, etc. However, the Alfresco developers seem to favor MySQL 5.0.

While some people worry that using an ORM will slow down their database access, most of the time the opposite is true since ORMs are able to implement certain optimizations much more efficiently than typical handwritten JDBC. Also, since ORMs can significantly reduce the amount of JDBC code, the programmers will have much more time left over to tune the approximately 1% of cases that do benefit from handwritten JDBC.

4. File Storage

While all the metadata, user information, security information and other structured data are stored in the database, all file bodies are stored on the file system. This relieves some burden on the database and keeps it smaller. However, it complicates the backup/restore process, since the database and the files will generally have different backup methods but need to be kept in sync.
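To make the sync concern concrete, here is a minimal sketch of the split between structured data and file bodies. It does not reflect Alfresco's actual storage code: the class `SplitStoreSketch`, its methods, the `.bin` naming and the `Map` standing in for the database are all assumptions made for illustration (Java 8 or later assumed).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: metadata lives in a "database" (a Map stands in for it),
// file bodies live on the file system, and a check reports where they disagree.
final class SplitStoreSketch {
    // Stand-in for the metadata table: document id -> relative path of the body.
    private final Map<String, String> metadata = new HashMap<>();
    private final Path contentRoot;

    SplitStoreSketch(Path contentRoot) {
        this.contentRoot = contentRoot;
    }

    // Write the body to disk first, then record the metadata row.
    void store(String id, String body) {
        String relativePath = id + ".bin";
        try {
            Files.write(contentRoot.resolve(relativePath),
                    body.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        metadata.put(id, relativePath);
    }

    // Ids whose metadata row points at a file that is missing on disk, as
    // could happen when the two backups are restored from different times.
    List<String> findMissingBodies() {
        List<String> missing = new ArrayList<>();
        for (Map.Entry<String, String> row : metadata.entrySet()) {
            if (!Files.exists(contentRoot.resolve(row.getValue()))) {
                missing.add(row.getKey());
            }
        }
        return missing;
    }
}
```

Running a check like `findMissingBodies()` after a restore is one way to detect that the database backup is newer than the file backup; the reverse case (orphaned files with no metadata row) would need a similar scan in the other direction.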

5. Spring Framework

Alfresco is built on the Spring framework, which allows architectural components to be added and replaced through configuration rather than coding.
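The idea of swapping components through configuration can be sketched without Spring itself. The example below is not Spring's API and not Alfresco's configuration: it uses plain `java.util.Properties` and reflection, and the interface `ContentStore`, both implementations, and the key `contentStore.class` are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical component interface; all names here are illustrative only.
interface ContentStore {
    String describe();
}

final class FileContentStore implements ContentStore {
    public String describe() { return "file system store"; }
}

final class MemoryContentStore implements ContentStore {
    public String describe() { return "in-memory store"; }
}

final class ConfiguredWiringSketch {
    // Reads the implementation class name from configuration text and
    // instantiates it reflectively, so that swapping implementations
    // requires a configuration change but no code change.
    static ContentStore loadStore(String config) {
        try {
            Properties props = new Properties();
            props.load(new StringReader(config));
            String className = props.getProperty("contentStore.class");
            if (className == null) {
                throw new IllegalStateException("contentStore.class is not set");
            }
            return (ContentStore) Class.forName(className)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (IOException | ReflectiveOperationException e) {
            throw new IllegalStateException("Could not wire content store", e);
        }
    }
}
```

With the classes in the default package, passing the text `contentStore.class=MemoryContentStore` yields the in-memory implementation, and changing the value to `FileContentStore` replaces it. Spring generalizes this pattern with bean definitions that also describe how components depend on each other.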

6. JSR-127 JSF

Alfresco uses the MyFaces implementation of JavaServer Faces (JSF). JSF is the next-generation Java-based web development environment, designed to simplify the use of reusable components in a web page and to separate model, view and controller for scalability. JSF has also been described as a tag-based interface for adding user interface capability simply, without programming.

For instance, the navigation path among the screens, including where to go back to, is kept in a configuration file, which makes it easy to add new screens and make sure the navigation continues to work.

7. JSR-168 Portal Components

Alfresco supports JSR-168 portlets that can be used in JBoss Portal, eXo Portal, or any other JSR-168 compatible portal. It includes many pre-packaged portlets, including browsing, space creation, uploading, versions, properties, collaboration, in-line editing, etc.

8. JSR-170

Alfresco supports JSR-170 level 2 which specifies a standard API to access content repositories in Java independently of implementation. The benefit of this is to allow storing, retrieving, searching and other library services across multiple repositories, rather than being locked into one repository.

9. Transfers

Besides the browser interface, the other ways to transfer files into and out of Alfresco are FTP, WebDAV (including DeltaV), CIFS, or custom programming through Web Services.

10. Text Search

Alfresco uses Lucene, the leading open source full-text engine, for its search engine. Alfresco has extended Lucene to understand not only the text within a content object but also its metadata and categories, and to allow several repositories to be searched simultaneously.
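Searching text and metadata together comes down to indexing each document under several named fields. The sketch below is a toy inverted index written against the Java standard library only; it is not Lucene's API or Alfresco's extension of it, and the class `TinyIndexSketch`, its methods and the field names `text` and `category` are assumptions for illustration.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of an inverted index that covers both body text and
// metadata fields. This is not Lucene's API, only the underlying idea.
final class TinyIndexSketch {
    // "field:term" -> ids of the documents containing that term in that field.
    private final Map<String, Set<String>> postings = new HashMap<>();

    // Index one field of a document; the body is simply the field named "text".
    void addField(String docId, String field, String value) {
        for (String term : value.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(field + ":" + term, k -> new TreeSet<>())
                        .add(docId);
            }
        }
    }

    // Look up a single term within a single field.
    Set<String> search(String field, String term) {
        String key = field + ":" + term.toLowerCase(Locale.ROOT);
        return postings.getOrDefault(key, new TreeSet<>());
    }
}
```

Because body text and metadata share one posting structure, a query can mix them freely: `search("text", "sales")` matches document bodies, while `search("category", "finance")` matches only the category metadata.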

11. Transformations

Alfresco supports POI (currently just Excel), PDFBox (PDF), and OpenOffice (all other formats) for transforming files from one format to another, or for providing text for Lucene to index. While this is not as robust as Verity filters or other commercially available text filters, and does not support as many formats, it is all Java based (unlike Verity), so it will run on all supported platforms, and there are no licensing fees (which can be significant with Verity).
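The division of labor described above amounts to a dispatch on file type with a catch-all fallback. The following sketch only illustrates that selection rule; the class `TransformerChoiceSketch` and its method are hypothetical, keying on MIME type is an assumption, and the real transformer selection in Alfresco may work differently.

```java
// Hypothetical sketch of choosing a transformation library by MIME type,
// following the split described in the text. Not Alfresco's actual code.
final class TransformerChoiceSketch {
    static String chooseTransformer(String mimeType) {
        switch (mimeType) {
            case "application/vnd.ms-excel":
                return "POI";        // Excel spreadsheets
            case "application/pdf":
                return "PDFBox";     // PDF documents
            default:
                return "OpenOffice"; // fallback for all other formats
        }
    }
}
```

For example, `chooseTransformer("application/msword")` falls through to the OpenOffice fallback, since only Excel and PDF have dedicated libraries in this scheme.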

12. Enterprise Hold Backs

These features are not available in the open source edition, only in the paid enterprise edition:

  1. Distributed High Availability
  2. Clustering
  3. Group based security
  4. Single sign-on through NTLM
  5. LDAP
  6. Compliance - Secure Lifecycle Management
