Wednesday, June 17, 2015

CORS: Cross-Origin Resource Sharing

The future of the web is cross-domain, not same-origin. CORS continues the spirit of the open web by bringing API access to all.

The same-origin policy is an important security concept implemented by web browsers to prevent JavaScript code from making requests against a different origin (e.g., a different domain, scheme, or port) than the one from which it was served.

Cross-Origin Resource Sharing (CORS) is a specification that enables open access across domain boundaries. The spec defines a set of headers that allow the browser and server to communicate about which requests are (and are not) allowed. CORS is a technique for relaxing the same-origin policy, allowing JavaScript on a web page to consume a REST API served from a different origin.

Simple implementation:

In the simplest scenario, cross-origin communication starts with a client making a GET, POST, or HEAD request against a resource on the server. In this scenario, the content type of a POST request is limited to application/x-www-form-urlencoded, multipart/form-data, or text/plain. The request includes an Origin header that indicates the origin of the client code.
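As a rough illustration, the two criteria above (method and POST content type) can be written as a small check. This is a simplified sketch: the class and method names are hypothetical, the treatment of a POST without a content type is an assumption, and the actual specification also places restrictions on request headers that are not modelled here.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of any library: classifies a request as
// "simple" using only the method and content-type criteria described above.
class SimpleRequestCheck {
    private static final Set<String> SIMPLE_METHODS =
            new HashSet<String>(Arrays.asList("GET", "POST", "HEAD"));
    private static final Set<String> SIMPLE_POST_TYPES = new HashSet<String>(Arrays.asList(
            "application/x-www-form-urlencoded", "multipart/form-data", "text/plain"));

    static boolean isSimple(String method, String contentType) {
        if (!SIMPLE_METHODS.contains(method)) {
            return false;
        }
        if (!method.equals("POST")) {
            return true; // the content-type limit above only applies to POST
        }
        if (contentType == null) {
            return true; // assumption: a POST without a content type counts as simple
        }
        // Drop parameters such as "; charset=UTF-8" before comparing.
        String mediaType = contentType.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        return SIMPLE_POST_TYPES.contains(mediaType);
    }
}
```

For example, a POST with application/json falls outside this simple scenario, while a POST with text/plain; charset=UTF-8 stays inside it.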

The server will consider the request's Origin and either allow or disallow the request. If the server allows the request, then it will respond with the requested resource and an Access-Control-Allow-Origin header in the response. This header will indicate to the client which client origins will be allowed to access the resource. Assuming that the Access-Control-Allow-Origin header matches the request's Origin, the browser will allow the request.

On the other hand, if Access-Control-Allow-Origin is missing in the response or if it doesn't match the request's Origin, the browser will disallow the request.
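The exchange described above can be sketched in plain Java. This is only a model of the decision logic, not browser or server code: the class, the method names, and the example origin are made up for illustration, and the allow-list approach on the server side is one possible policy among many.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of the two sides of a simple CORS exchange.
class CorsDecision {

    // Server side: returns the value to send in Access-Control-Allow-Origin,
    // or null when the header should be left out of the response.
    static String allowOriginHeader(String requestOrigin, Set<String> allowedOrigins) {
        if (requestOrigin != null && allowedOrigins.contains(requestOrigin)) {
            return requestOrigin;
        }
        return null;
    }

    // Browser side: the response is exposed to the page only when the header
    // is present and matches the request's Origin (or is the wildcard "*",
    // which matches any origin).
    static boolean browserAllows(String requestOrigin, String allowOriginHeader) {
        if (allowOriginHeader == null) {
            return false;
        }
        return allowOriginHeader.equals("*") || allowOriginHeader.equals(requestOrigin);
    }

    public static void main(String[] args) {
        // Assumed allow-list with a single made-up origin.
        Set<String> allowed = new HashSet<String>(Arrays.asList("https://app.example.com"));

        String ok = allowOriginHeader("https://app.example.com", allowed);
        String denied = allowOriginHeader("https://other.example.org", allowed);

        System.out.println(browserAllows("https://app.example.com", ok));
        System.out.println(browserAllows("https://other.example.org", denied));
    }
}
```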

For simple CORS requests that should be open to any origin, the server only needs to add the following header to its response:

Access-Control-Allow-Origin: *

The way CORS is implemented differs between platforms. For Tomcat, for example, a minimal CORS configuration in web.xml looks like this:

<filter>
  <filter-name>CorsFilter</filter-name>
  <filter-class>org.apache.catalina.filters.CorsFilter</filter-class>
</filter>
<filter-mapping>
  <filter-name>CorsFilter</filter-name>
  <url-pattern>/*</url-pattern>
</filter-mapping>

N.B: Bear in mind that CORS is not about providing server-side security. The Origin request header is produced by the browser and the server has no direct means to verify it.

Tuesday, June 16, 2015

Concurrency & Parallelism

The terms concurrency and parallelism come up often when discussing multithreaded programs.

Concurrency means that multiple tasks are in progress at the same time. An application may process one task at a time (sequentially) or work on multiple tasks at the same time (concurrently).

Parallelism means that each task is broken into subtasks which can be processed in parallel. It relates to how an application handles each individual task: it may process the task serially from start to end, or split it into subtasks which can be completed in parallel.

An application can be concurrent, but not parallel. This means that it processes more than one task at the same time, but the tasks are not broken down into subtasks.
An application can also be parallel but not concurrent. This means that the application only works on one task at a time, and this task is broken down into subtasks which can be processed in parallel.
Additionally, an application can be neither concurrent nor parallel. This means that it works on only one task at a time, and the task is never broken down into subtasks for parallel execution.
Finally, an application can also be both concurrent and parallel, in that it both works on multiple tasks at the same time and breaks each task down into subtasks for parallel execution. However, some of the benefits of concurrency and parallelism may be lost in this scenario, as the CPUs in the computer are already kept reasonably busy with either concurrency or parallelism alone. Combining them may lead to only a small performance gain or even a performance loss. We should analyze and measure carefully rather than adopt a concurrent, parallel model blindly.
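To make the distinction concrete, here is a minimal Java sketch using the definitions above. The class and method names are made up for illustration. The first method works on several independent tasks at once without splitting any of them (concurrent, not parallel in the sense used here); the second takes a single task, summing a range of numbers, and lets a parallel stream split it into subtasks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.LongStream;

class ConcurrencyVsParallelism {

    // Concurrent: every task gets its own worker thread, so several tasks are
    // in progress at once, but each task still runs serially from start to end.
    static List<Long> runConcurrently(List<Callable<Long>> tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, tasks.size()));
        try {
            List<Long> results = new ArrayList<Long>();
            for (Future<Long> future : pool.invokeAll(tasks)) {
                results.add(future.get()); // results come back in task order
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for tasks", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a task failed", e);
        } finally {
            pool.shutdown();
        }
    }

    // Parallel: one task (summing a range) is split into subtasks that the
    // stream framework may process on several threads, then combined.
    static long sumInParallel(long fromInclusive, long toInclusive) {
        return LongStream.rangeClosed(fromInclusive, toInclusive).parallel().sum();
    }
}
```

Whether either variant is actually faster than the sequential version depends on the workload and the machine, which is why measuring first matters.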

Friday, June 12, 2015

Cloud: IaaS, PaaS, SaaS

Cloud computing is the phrase used to describe different scenarios in which computing resource is delivered as a service over a network connection (usually, this is the internet). Cloud computing is therefore a type of computing that relies on sharing a pool of physical and/or virtual resources, rather than deploying local or personal hardware and software.

IaaS:

Infrastructure as a Service (IaaS) is one of the three fundamental service models of cloud computing alongside Platform as a Service (PaaS) and Software as a Service (SaaS). As with all cloud computing services it provides access to computing resource in a virtualised environment, “the Cloud”, across a public connection, usually the internet. In the case of IaaS the computing resource provided is specifically that of virtualised hardware, in other words, computing infrastructure. The definition includes such offerings as virtual server space, network connections, bandwidth, IP addresses and load balancers. Physically, the pool of hardware resource is pulled from a multitude of servers and networks usually distributed across numerous data centers, all of which the cloud provider is responsible for maintaining. The client, on the other hand, is given access to the virtualised components in order to build their own IT platforms.

In common with the other two forms of cloud hosting, IaaS can be utilised by enterprise customers to create cost effective and easily scalable IT solutions where the complexities and expenses of managing the underlying hardware are outsourced to the cloud provider. If the scale of a business customer’s operations fluctuates, or they are looking to expand, they can tap into the cloud resource as and when they need it rather than purchase, install and integrate hardware themselves.

PaaS:

Platform as a Service, often simply referred to as PaaS, is a category of cloud computing that provides a platform and environment to allow developers to build applications and services over the internet. PaaS services are hosted in the cloud and accessed by users simply via their web browser.

Platform as a Service allows users to create software applications using tools supplied by the provider. PaaS services can consist of preconfigured features that customers can subscribe to; they can choose to include the features that meet their requirements while discarding those that do not. Consequently, packages can vary from offering simple point-and-click frameworks where no client side hosting expertise is required to supplying the infrastructure options for advanced development.

The infrastructure and applications are managed for customers and support is available. Services are constantly updated, with existing features upgraded and additional features added. PaaS providers can assist developers from the conception of their original ideas to the creation of applications, and through to testing and deployment. This is all achieved in a managed mechanism.

As with most cloud offerings, PaaS services are generally paid for on a subscription basis with clients ultimately paying just for what they use. Clients also benefit from the economies of scale that arise from the sharing of the underlying physical infrastructure between users, and that results in lower costs.

Below are some of the features that can be included with a PaaS offering:

Operating system
Server-side scripting environment
Database management system
Server Software
Support
Storage
Network access
Tools for design and development
Hosting

Software developers, web developers and businesses can benefit from PaaS. Whether building an application which they are planning to offer over the internet or software to be sold out of the box, software developers may take advantage of a PaaS solution. For example, web developers can use individual PaaS environments at every stage of the process to develop, test and ultimately host their websites. However, businesses that are developing their own internal software can also utilise Platform as a Service, particularly to create distinct ring-fenced development and testing environments.

SaaS:

SaaS, or Software as a Service, describes any cloud service where consumers are able to access software applications over the internet. The applications are hosted in “the cloud” and can be used for a wide range of tasks for both individuals and organisations. Google, Twitter, Facebook and Flickr are all examples of SaaS, with users able to access the services via any internet enabled device. Enterprise users are able to use applications for a range of needs, including accounting and invoicing, tracking sales, planning, performance monitoring and communications (including webmail and instant messaging).

SaaS is often referred to as software-on-demand and utilising it is akin to renting software rather than buying it. With traditional software applications you would purchase the software upfront as a package and then install it onto your computer. The software’s licence may also limit the number of users and/or devices where the software can be deployed. Software as a Service users, however, subscribe to the software rather than purchase it, usually on a monthly basis. Applications are purchased and used online with files saved in the cloud rather than on individual computers.

Thursday, June 11, 2015

Microservices

The industry standard approach for deploying Java EE applications is packaging all components into a single EAR/WAR archive and deploying the archive on an application server. Although this approach has several advantages, particularly from the ease-of-development perspective, it leads to a monolithic architecture, makes applications difficult to maintain, and, particularly important, makes such applications more difficult and sometimes impossible to scale to meet today’s real-world demands, especially in PaaS (cloud) environments.

Microservice architecture addresses these shortcomings by decomposing an application into a set of microservices. Each microservice has well-defined functionality and an interface for communicating with other microservices (such as REST, SOAP/WSDL, or, if needed, even RMI). Most often, microservices are stateless.
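As a minimal sketch of such a service, the following uses the HTTP server that ships with the JDK (com.sun.net.httpserver). Everything domain-specific here is an assumption made for illustration: the tax-quote example, the 20% rate, the /quote path and the netCents parameter. The service is stateless in that each response depends only on the incoming request.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

// Hypothetical stateless microservice exposing one REST-style endpoint:
//   GET /quote?netCents=1000
class TaxQuoteService {
    static final long TAX_PERCENT = 20; // assumed rate, for illustration only

    // Pure business logic: the result depends only on the input.
    static String quoteJson(long netCents) {
        long taxCents = netCents * TAX_PERCENT / 100;
        return "{\"netCents\":" + netCents + ",\"taxCents\":" + taxCents + "}";
    }

    // Parses a query string of the exact form "netCents=<number>".
    static long parseNetCents(String query) {
        if (query == null || !query.startsWith("netCents=")) {
            throw new IllegalArgumentException("missing netCents");
        }
        return Long.parseLong(query.substring("netCents=".length()));
    }

    // Starts the service on the given port (0 lets the OS pick a free one).
    static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/quote", exchange -> {
            String body;
            int status;
            try {
                body = quoteJson(parseNetCents(exchange.getRequestURI().getQuery()));
                status = 200;
            } catch (IllegalArgumentException e) {
                body = "{\"error\":\"expected ?netCents=<number>\"}";
                status = 400;
            }
            byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().set("Content-Type", "application/json");
            exchange.sendResponseHeaders(status, bytes.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(bytes);
            }
        });
        server.start();
        return server;
    }
}
```

Because the service keeps no state between requests, it can be built, deployed and scaled on its own, independently of whatever other services call it.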

Instead of packaging all microservices into a single archive (EAR/WAR), each microservice is developed and deployed independently of the others. This brings several advantages, such as:

Applications are more flexible;
Every microservice can be developed independently of other microservices, which simplifies lifecycle and change management and makes it easier to upgrade to newer versions;
It is easier to adopt new or different technologies for parts of an application;
It is much more efficient to scale applications in PaaS (Platform as a Service, i.e. cloud) and Docker-like environments.

The microservice approach also has its drawbacks, particularly the added complexity of development and, even more so, of deployment. Deploying microservices on stand-alone containers requires several steps, such as configuring the containers, defining the dependencies, and deploying the microservices.


Wednesday, June 10, 2015

MVC & REST API

MVC was first introduced by Trygve Reenskaug, a Smalltalk developer at the Xerox Palo Alto Research Center, in 1979. It helps to decouple data access and business logic from the manner in which the data is displayed to the user.

A key difference between a traditional MVC controller and a RESTful web service controller is the way that the HTTP response body is created. Rather than relying on a view technology to perform server-side rendering of, say, greeting data to HTML, a RESTful web service controller simply populates and returns an object such as a Greeting. The object data is written directly to the HTTP response as JSON.

REST is just a set of conventions about how to use HTTP. A "REST API" combines two things: a web service and its RESTful nature. By virtue of being a web service, you get some loose coupling. The client need not be aware of internal implementation details, and there is a real opportunity for platform and language independence.
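The difference in how the response body is produced can be sketched in plain Java, without any web framework. All names below are hypothetical, and the hand-written toJson method is only a stand-in for the JSON serialization that a framework would normally perform.

```java
// Hypothetical data object returned by the controllers below.
class Greeting {
    final long id;
    final String content;

    Greeting(long id, String content) {
        this.id = id;
        this.content = content;
    }
}

class GreetingControllers {

    // Traditional MVC style: the response body is HTML rendered on the server.
    // This inline markup stands in for a view template (a real template
    // engine would also escape the content).
    static String mvcResponseBody(String name) {
        Greeting greeting = new Greeting(1, "Hello, " + name + "!");
        return "<html><body><h1>" + greeting.content + "</h1></body></html>";
    }

    // RESTful style: the controller only populates and returns the object.
    static Greeting restGreeting(String name) {
        return new Greeting(1, "Hello, " + name + "!");
    }

    // Stand-in for framework serialization: writes the object as JSON,
    // escaping backslashes and double quotes only.
    static String toJson(Greeting greeting) {
        String escaped = greeting.content.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"id\":" + greeting.id + ",\"content\":\"" + escaped + "\"}";
    }
}
```

Calling toJson(restGreeting("World")) yields a JSON document that any client can parse, whereas mvcResponseBody("World") yields markup meant for a browser to display.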

Being RESTful offers additional benefits aimed at additional decoupling, so as to allow extreme scalability. REST forbids conversational state, which means we can scale very wide by adding additional server nodes behind a load balancer.
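The scaling argument can be illustrated with a toy round-robin load balancer. This is a sketch with made-up names: each "node" is just a function from request to response, and because that function keeps no conversational state, it does not matter which node a given request lands on.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical round-robin balancer in front of identical, stateless nodes.
class RoundRobinBalancer {
    private final List<Function<String, String>> nodes;
    private final AtomicInteger counter = new AtomicInteger();

    RoundRobinBalancer(List<Function<String, String>> nodes) {
        if (nodes.isEmpty()) {
            throw new IllegalArgumentException("at least one node is required");
        }
        this.nodes = nodes;
    }

    // Picks the next node index in rotation: 0, 1, 2, 0, 1, ...
    int nextIndex() {
        return Math.floorMod(counter.getAndIncrement(), nodes.size());
    }

    // Forwards the request to whichever node is next; since the nodes are
    // stateless, every node gives the same answer for the same request.
    String handle(String request) {
        return nodes.get(nextIndex()).apply(request);
    }
}
```

Adding capacity then amounts to adding another copy of the same function to the list, which is the "scale wide" idea in miniature.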

Tuesday, June 9, 2015

Big Data Processing

Big data is a somewhat nebulous term that describes data that can’t be processed by traditional data processing techniques, such as an RDBMS-based application running on a single machine. Big data isn’t necessarily a large volume of data. It can be data that is generated at a high velocity. Big data can also be data that has a lot of variety, such as unstructured data.

Hadoop and Other Tools

Apache Hadoop, a framework that allows for the distributed processing of large data sets, is probably the most well known of these tools. Besides providing a powerful MapReduce implementation and a reliable distributed file system, the Hadoop Distributed File System (HDFS), there is also an ecosystem of big data tools built on top of Hadoop, including the following, to name a few:

Apache HBase is a distributed database for large tables.
Apache Hive is a data warehouse infrastructure that allows ad hoc SQL-like queries over HDFS stored data.
Apache Pig is a high-level platform for creating MapReduce programs.
Apache Mahout is a machine learning and data mining library.
Apache Crunch and Cascading are both frameworks for creating MapReduce pipelines.

Although these tools are powerful, they also add overhead that won’t pay off unless your data set is really big.
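For readers new to the MapReduce model mentioned above, here is the classic word count written in plain Java. This sketch deliberately does not use the Hadoop API and runs on a single machine; the class and method names are made up. It only shows the shape of the computation: a map phase that emits (word, 1) pairs and a reduce phase that sums the values per key.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Single-machine illustration of the MapReduce idea (not Hadoop code).
class WordCountSketch {

    // Map phase: turn one line of text into (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<Map.Entry<String, Integer>>();
        for (String word : line.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!word.isEmpty()) {
                pairs.add(new AbstractMap.SimpleEntry<String, Integer>(word, 1));
            }
        }
        return pairs;
    }

    // Shuffle and reduce phase: group the pairs by word and sum the counts.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new TreeMap<String, Integer>();
        for (Map.Entry<String, Integer> pair : pairs) {
            counts.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return counts;
    }

    static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> all = new ArrayList<Map.Entry<String, Integer>>();
        for (String line : lines) {
            all.addAll(map(line)); // in a cluster, lines would be mapped on many nodes
        }
        return reduce(all);
    }
}
```

In a real cluster the map calls run on many nodes and the framework handles grouping by key, which is where the overhead mentioned above comes from.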

There are many configuration tunings you can apply to a Hadoop cluster, and you can always add more nodes if your application is not processing your data as fast as you need it to. However, keep in mind that nothing will have a bigger impact on your big data application’s performance than making your own code faster.

The point here is that every microsecond counts. Choose the fastest Java data structures for your problem, use caching when possible, avoid unnecessary object instantiations, use efficient String manipulation methods, and use your Java programming skills to produce the fastest code you can.
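One small example of the String advice: building a long string inside a loop. The class and method names are hypothetical. Both methods return the same result, but the first creates a new String object on every iteration, while the second appends into a single StringBuilder. How much this matters in practice is something to measure on your own data.

```java
class StringConcatSketch {

    // Repeated concatenation: each += allocates a new String that copies
    // everything accumulated so far.
    static String concatInLoop(String[] parts) {
        String result = "";
        for (String part : parts) {
            result += part;
        }
        return result;
    }

    // StringBuilder: appends into one growing buffer and creates the final
    // String once at the end.
    static String concatWithBuilder(String[] parts) {
        StringBuilder builder = new StringBuilder();
        for (String part : parts) {
            builder.append(part);
        }
        return builder.toString();
    }
}
```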

Big data technology, such as Apache Hadoop, tackles the problems of volume and velocity by scaling horizontally using fault-tolerant software, which tends to be cheaper and more scalable than the more traditional approach of vertically scaling very reliable hardware. Apache Hadoop deals with variety by using storage formats that support both unstructured and structured data. Machine learning (ML) algorithms are commonly used to process big data.

Resources:

Hadoop: http://www.amazon.com/Hadoop-Definitive-Guide-Tom-White/dp/1449311520
Hadoop Tutorials: https://developer.yahoo.com/hadoop/tutorial/
