Lift, State, and Scaling

Lift is a web framework built on the Scala programming language. Lift takes advantage of many of Scala's features that allows developers to very concisely code secure, scalable, highly interactive web applications. Lift provides a full set of layered abstractions on top of HTTP and HTML from "close to the metal" REST abstractions up to transportation agnostic server push (Comet) support.

Scala compiles to JVM byte-code and is compatible with Java libraries and the Java object model. Lift applications are typically deployed as WAR files in J/EE web containers... Lift apps run in Tomcat, Jetty, Glassfish, etc. just like any other J/EE web applications. Lift apps can generally be monitored and managed just like any Java web app.

Web Applications, Sessions, and State

All web applications are stateful in one way or another. Even a "static" web site is made up of the files that are served... the application's state is defined in those files. The site content may be served out of a database, but the content served does not depend on identity of the user or anything about the HTTP request except the contents of the HTTP request. These contents can include the URI, parameters, and headers. The complete value of the response can be calculated from the request without referencing any resources except the content resources. For the purpose of this discussion, I will refer to these as session-less requests. News sites like the UK Guardian, MSNBC, and others are prototypical examples of this kind of site.

Sessions

Some applications are customized on a user-by-user basis. These applications include the likes of Foursquare and others where many HTTP requests make up a "session" in which the results of previous HTTP requests change the behavior of future HTTP requests. Put in concrete terms, a user can log into a site and for some duration, the responses are specific to that user.

There are many mechanisms for managing sessions, but the most common and secure method is creating a cryptographically unique token (a session id), and putting that token in the Set-Cookie response header such that the browser will present that Cookie in subsequent HTTP requests for a certain period of time. The server-side state is referenced by the Cookie and the state is made available to the web application during the scope of servicing the request and any mutations the web app makes to session state during the request are kept on the server and are available to the application in subsequent requests.

Another available technique for managing state is to serialize application state in the Cookie and deliver it to the browser such that the server is not responsible for managing state across requests. As we've recently discovered, this is a tremendously insecure way to manage application state. Further, for any moderately complex application, the amount of data the needs to be transferred as part of each request and response is huge.

Migratory Sessions

Many web application managers allow for server-managed sessions to migrate across a cluster of web application servers. In some environments such as Ruby on Rails, this is a hard requirement because only one request at a time can be served per process, thus for any moderate traffic site, there must be multiple processes serving pages. There are many strategies for migrating state across processes: storing state on disk, in memcached, in a database (relational or NoSQL), or having some proprietary cluster communications protocol. In any of these scenarios sessions can migrate across the grid of processes serving requests for a given web application. Web applications that support migratory state are often referred to as "stateless" because the session state does not reside in the same process as the web application.

Session Affinity

Some applications require that all requests related to a particular session are routed to the same process and that process keeps session-related content in local memory. In a cluster, there are multiple mechanisms for achieving session affinity... the two most popular being HAProxy and Nginx.

Availability, Scalability, Security, Performance, and User Experience

There are many vectors on which to measure the over-quality of a web applications. Let's take a quick peek at each one.

Availability

Availability of an application is the amount of time it gives a meaningful response to a request. Highly available applications generally span multiple pieces of hardware and often multiple data centers. Highly available applications are also typically available during upgrades of part of the system that makes up the application. Highly available applications have very few single points of failure and those single points of failure are usually deployed on very reliable hardware.

Scalability

A scalable application can, within certain bounds, respond with similar performance to increased load by adding hardware to process more load. No system is infinitely or linearly scalable. However, many systems have grossly disproportionate load demands such that, for example, you can add a lot of web application front-ends to a Rails application before there's enough load on the back-end RDBMS such that scaling is impaired.

Security

The Internet is a dangerous place and no request that is received from the Internet can be trusted. Applications, frameworks, systems and everything else must be designed to be secure and resist attacks. The most common attacks on web application are listed in the OWASP Top Ten.

Performance

Web application performance can be measured on two vectors: response time to a request and system resources required to service the request. These two vectors are inter-dependent

User Experience

The user experience of a web app is an important measure of its quality. User experience can be measured on many different vectors including perceived responsiveness, visual design, interactivity, lack of "hicups", etc. Ultimately, because we're building applications for users, the user experience is very important.

Lift's trade-offs

Given the number and complexity related to the quality of a web application, there are a lot of trade-offs, implicit and explicit, to building a framework that allows developers and business people to deliver a great user experience.

Let's talk for a minute about what Lift is and what it isn't. Lift is a web framework. It provides a set of abstractions over HTTP and HTML such that developers can write excellent web applications. Lift is persistence agnostic. You can use Lift with relational databases, file systems, NoSQL data stores, mule carts, etc. As long as you can materialize an object into the JVM where Lift is running, Lift can make use of that object.

Lift sits on top of the JVM

Lift applications execute in the Java Virtual Machine. The JVM is a very high performance computing system. There are raging debates as to the relative performance of JVM code and native machine code. No matter which benchmarks you look at, the JVM is a very fast performer. Lift apps take advantage of the JVM's performance characteristics. Moderately complex Lift apps that access the database can serve 1,000+ requests per second on quad-core Intel hardware. Even very complex Lift apps that make many back-end calls per request can serve hundreds of requests per second on EC2 large instances.

Lift as proxy

Many web applications, typically REST applications, provide a very thin layer on top of a backing data store. The web application serves a few basic functions to broker between the HTTP request and the backing store. These functions include: request and parameter validation, authentication, parameter unpacking, back-end service request, and translation of response data to wire format (typically XML or JSON). Lift can service these kinds of requests within the scope of a session or without any session at all, depending on application design. For more information on Lift's REST features, see Lift RestHelper. When running these kinds of services, Lift apps can be treated without regard for session affinity.

Lift as HTML generator

Lift has a powerful and secure templating mechanism. All Lift templates are expressed as valid XML and during the rendering process, Lift keeps the page in XML format. Pages rendered via Lift's templating mechanism are generally resistant to cross site scripting attacks and other attacks that insert malicious content in rendered pages. Lift's templating mechanism is designer friendly yet support complex and powerful substitution rules. Further, the rendered page can be evaluated and transformed during the final rendering phase to ensure that all script tags are at the bottom of the page, all CSS tags are at the top, etc. Lift's templating mechanism can be used to serve sessionless requests or serve requests within the context of a session. Further, pages can be marked as not requiring a session, yet will make session state available is the request was made in the context of a container session. Lift page rendering can even be done in parallel such that if there are long off-process components on the page (e.g., advertising servers), those components can be

Sessionless Lift, forms and Ajax

Lift applications can process HTML forms and process Ajax requests even if there's no session associated with the request. Such forms and Ajax requests have to have stable field names and stable URLs, but this is the same requirement as most web frameworks including Struts, Rails, and Django impose on their applications. In such a mode, Lift apps have the similar characteristics to web apps written on tops of Struts, Play, JSF and other popular Java web frameworks.

Lift as Secure, Interactive App Platform

Lift features require session affinity: GUID to function mapping, type-safe SessionVars and Comet. Applications that take advantage of these features need to have requests associated with the JVM that stores the session. I'll discuss the reason for this limitation, the down-side to the limitation, the downside to migratory session, and the benefits of these features.

Application servers that support migratory sessions (sessions that are available to application servers running in multiple address spaces/processes) require a mechanism for transferring the state information between processes. This is typically (with the exception of Terracotta) done by serializing the stored data. Serialization is the process of converting rich data structures into a stream of bytes.

Some of Scala's constructs are hard or impossible to serialize. For example, local variables that are mutated within a closure are promoted from stack variables to heap variables. When those variables are serialized at different times, the application winds up with two references even though the references are logically the same. Lift makes use of many of these constructs (I'll explain why next) and Lift's use of these constructs makes session serialization and migration impossible. It also means that Lift's type-safe SessionVars are not guaranteed to be serialized.

One of the key Lift constructs is to map a cryptographically unique identifier in the browser to a function on the server. Lift uses Scala functions which close over scope, including all of the variables referenced by the function. This means that it's not necessary to expose primary keys to the client when editing a record in the database because the primary key of the record or the record itself is known to the function on the server. This guards against OWASP Vulnerability A4, Insecure Object References as well as Replay Attacks. From the developer's standpoint, writing Lift applications is like writing a VisualBasic application... the developer associates the user action with a function. Lift supplies the plumbing to bridge between the two.

Lift's GUID to function mapping extends to Lift's Ajax support. Associating a button, checkbox, or other HTML element with an Ajax call is literally a single line:

SHtml.ajaxButton(<b>PressMe</b>, () => Alert("You pressed a button at "+Helpers.currentTimeFormatted)

Lift's Ajax support is simple, maintainable, and secure. There's no need to build and maintain routing.

Lift has the most advanced server-push/Comet support of any web framework or any other system currently available. Lift's comet support relies on session affinity. Lift's comet support associates an Actor with a section of screen real estate. A single browser window may have many pieces of screen real estate associated with many of Lift's CometActors. When state changes in the Actor, the state change is pushed to the browser. Lift takes care of multiplexing a single HTTP connection to handle all the comet items on a given page, the versioning of the change deltas (if the HTTP connection is dropped while 3 changes become available, all 3 of those changes are pushed when the next HTTP request is made.) Further, Lift's comet support will work the same way once web sockets are available to the client and server... there will be no application code changes necessary for web sockets support. Lift's comet support requires that the connect is made from the browser back to the same JVM in which the CometActors are resident... the same JVM where the session is located.

The downside to Lift's session affinity requirement mainly falls on the operations team. They must use a session aware load balancer or other mechanism to route incoming requests to the server that the session is associated with. This is easily accomplished with HAProxy and Nginx. Further, if the server running a given session goes down, the information associated with that session is lost (note that any information distributed off-session [into a database, into a cluster of Akka actors, etc.] is preserved.) But, Lift has extended session facilities that support re-creation of session information in the event of session lost. Lift also has heart-beat functionality so that sessions are kept alive as long as a browser page is open to the application, so user inactivity will not result in session timeouts.

Compared to the operational cost of a session aware load balancer, there are many costs associated with migratory sessions. First, there must be a persistence mechanism for those sessions. Memcached is an unreliable mechanism as memcached instances have no more stability than the JVM which hosts the application and being a cache, some sessions may get expired. Putting session data in backing store such as MySQL or Cassandra increases the latency of requests. Further, the costs of serializing state, transmitting the state across the network, storing it, retrieving it, transmitting it across the network, and deserializing it all costs a lot of cycles and bandwidth. When your Lift application scales beyond a single server, beyond 100 requests per second, the costs of migrating state on every request becomes a significant operational issue.

Session serialization can cause session information loss in the case of multiple requests being executed in multiple processes. It's common to have multiple tabs/windows open to the same application. If session data is serialized as a blob and two different requests from the same server are being executed at the same time, the last request to write session data into the store will over-write the prior session data. This is a concurrency problem and can lead to hard to debug issues in production because reproducing this kind of problem is non-trivial and this kind of problem is not expected by developers.

The third issue with migratory sessions and session serialization is that the inability to store complex information in the session (e.g., a function that closes over scope) means that the developer has to write imperative code to serialize session state to implement complex user interactions like multi-screen wizards (which is a 400 line implementation in Lift). These complex, hand written serializations are error prone, can introduce security problems and are non-trivial to maintain.

The operational costs of supporting session affinity are not materially different from the operational costs of providing backing store for migratory sessions. On the other hand, there are many significant downsides to migratory sessions. Let's explore the advantages of Lift's design.

Lift's use of GUIDs associated with functions on the server:

Increase the security of the application by guarding against cross site request forgeries, replay attacks, and insecure object references
Decrease application development and maintenance time and costs
Increase application interactivity, thus a much better user experience
Increase in application richness because of simpler Ajax, multi-page Wizards, and Comet
Improved application performance because fewer cycles are spent serializing and transmitting session information
No difference in scalability... just add more servers to the front end to scale the front end of your application

The positive attributes of Lift's design decisions are evident at Foursquare which handles thousands of requests per second all served by Lift. There are very few sites that have more traffic than Foursquare. They have scaled their web front end successfully and securely with Lift. Other high volume sites including Novell are successfully scaling with Lift.

If you are scaling your site, there are also commercial Lift Cloud manager tools that can help manage clusters of Lift's session requirements.

Conclusion

Lift provides a lot of choices for developing and deploying complex web applications. Lift can operate in a web container like any other Java web framework. If you choose to use certain Lift features and you are deploying across multiple servers, you need to have a session aware load balancer. Even when using Lift's session-affinity dependent features, Lift applications have higher performance, identical availability, identical scalability, better security, and better user experience than applications written with web frameworks such as Ruby on Rails, Struts, and GWT.

Lift, State, and Scaling November 2, 2010