Wednesday, November 11, 2009

New Blog Location

The Customer Quality and Success blog has been moved to the Space-Time Research website.

Please update your bookmarks to www.spacetimeresearch.com/blog.html

Sunday, November 1, 2009

Protecting confidentiality - some real life examples

Don McIntosh has come to the party again by contributing a new blog post on how we are enabling our customers to disseminate detailed information while protecting the privacy of individuals. For providers of official statistics under pressure to make data more available and governments more transparent, we show that it *can* be done - you *can* release data.

We are currently engaging with three customers and developing new requirements around privacy protection for their data. For two of the three, the main goal is to deliver more detailed, useful data to their customers without compromising individual privacy. The other key goals are reducing the risk of accidentally releasing sensitive data (a goal of increasing importance given the Gov 2.0-fuelled demand for more open data), and reducing the costs associated with applying privacy protection. I thought I'd write a short note to summarise our recent work in this area.

We have an API plugin architecture for applying disclosure control. Basically, you can build your own modules that do things like adjust, conceal, and/or annotate cell values based on certain rules, or reject a query if it's deemed too sensitive for whatever reason. You can also record query details and use them to monitor for potential privacy intrusions.
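To make that concrete, here's a minimal sketch (in Java, since that's what our products are built on) of what such a plugin contract could look like. Every name and signature below is invented for illustration - this is not the actual SuperSTAR plugin API:

/**
 * A minimal sketch of what a disclosure-control plugin could look like.
 * Every name here is invented for illustration; this is not the actual
 * SuperSTAR plugin API.
 */
public class DisclosureControlSketch {

    /** The outcome for one cell: a value to show, or an annotation such as "..C". */
    public record CellResult(Double value, String annotation) {}

    /** The contract a plugin module might implement. */
    public interface DisclosurePlugin {
        /** Return false to reject the whole query as too sensitive. */
        boolean allowQuery(String querySummary);

        /** Adjust, conceal, or annotate a single cell value. */
        CellResult processCell(double value, int contributorCount);
    }

    /** Example rule: conceal any cell with fewer than 5 contributors. */
    public static class SmallCellConcealer implements DisclosurePlugin {
        @Override
        public boolean allowQuery(String querySummary) {
            return true; // accept all queries; act at the cell level instead
        }

        @Override
        public CellResult processCell(double value, int contributorCount) {
            return contributorCount < 5
                    ? new CellResult(null, "..C")   // concealed
                    : new CellResult(value, null);  // passed through unchanged
        }
    }

    public static void main(String[] args) {
        DisclosurePlugin plugin = new SmallCellConcealer();
        System.out.println(plugin.processCell(42.0, 3));  // concealed
        System.out.println(plugin.processCell(42.0, 12)); // passes through
    }
}

The same shape accommodates the other behaviours mentioned: a rounding module would return an adjusted value instead of a concealment marker, and a query monitor would do its recording inside allowQuery.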

The work we are looking at doing in relation to current customer requests includes the following:

  • Implementing plugins with customised rounding and concealment rules. This is straightforward work as far as our current architecture is concerned, and helps our customers with these requirements to implement rules that maximise the data they can make available. For one customer, we have written a plugin that will suppress numbers less than a certain value, and any related totals. So, for example, if you were suppressing all numbers in a table less than or equal to 3, a returned table would show suppression of each such cell, plus any totals containing that cell. By suppressing the totals, you prevent someone from back-calculating a value that has been suppressed (there's a small sketch of this rule after this list).

  • Allowing custom selection of different rule combinations for testing and more advanced use of disclosure control. This is especially useful where you have a few in-house specialists who are authorised to apply more lenient rules when responding to ad hoc information requests.

  • Extending confidentiality to apply to the output of calculations (SuperSTAR field derivations). You might have a function that in some cases returns "..C" instead of a real value for certain cells, as per the example above. Extending confidentiality to derived data would make it possible, for example, to calculate a statistical mean or median and conceal the result if there were fewer than a certain number of contributors.
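As promised above, here's a standalone sketch of the small-cell suppression rule, assuming a threshold of 3 and a single row with its total. A real implementation would run inside the plugin architecture described earlier; this toy version just shows why the totals have to go too:

import java.util.Arrays;

/**
 * Standalone sketch of the suppression rule above: hide counts <= 3,
 * and hide the row total whenever any cell in the row was hidden, so
 * the hidden value cannot be recovered by subtraction.
 */
public class SuppressionSketch {
    static final int THRESHOLD = 3; // suppress counts less than or equal to this

    /** Returns the row plus its total, with nulls marking suppressed values. */
    static Integer[] suppressRow(int[] cells) {
        Integer[] out = new Integer[cells.length + 1]; // last slot is the total
        int total = 0;
        boolean anySuppressed = false;
        for (int i = 0; i < cells.length; i++) {
            total += cells[i];
            if (cells[i] <= THRESHOLD) {
                out[i] = null; // suppressed cell
                anySuppressed = true;
            } else {
                out[i] = cells[i];
            }
        }
        // A visible total would let a reader back-calculate the hidden cell,
        // so the total is suppressed along with it.
        out[cells.length] = anySuppressed ? null : total;
        return out;
    }

    public static void main(String[] args) {
        // Counts 2, 15, 9: the 2 is hidden, and so is the total (26).
        System.out.println(Arrays.toString(suppressRow(new int[]{2, 15, 9})));
        // Output: [null, 15, 9, null]
    }
}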

We are really keen to hear from our customers and other interested parties. If you have some recent experience in using confidentiality in SuperSTAR or elsewhere, or would like to give us any kind of related feedback, please do feel free to leave a comment or contact us directly.

Wednesday, October 21, 2009

Why APIs are important for Gov2.0

I was at the Gov 2.0 conference in Canberra earlier in the week and found that compared to the talk around social engagement through Twitter and Facebook, the whole concept of open data and APIs took a back seat for much of the event. APIs were mentioned by speakers, but I did not get any sense that the majority of the attendees were thinking about APIs and mash-up-ability of data as much as I do. I also wasn't sure that everyone knew what an API was, or why you would want one.

So we asked our Director of Product Planning, Don McIntosh, to write an article about what APIs are and why they're important. This is what he has to say.

With social applications, there is a clear and obvious use that everyone can understand, and the staggering traffic volumes for these sites make the topic all the more compelling. But what about open data and APIs? Why should we pay them any attention and how do we benefit from them?

An API is an Application Programming Interface. Web-based APIs, sometimes referred to as Web services, are growing at a phenomenal rate. Basically, instead of information being presented in a predetermined manner through Web pages, APIs allow other applications (iPhone apps, Websites, MS Windows applications…) to extract specific chunks of information and combine them with other information in all kinds of ways to serve a specific purpose. Jim Ericson from Information Management blogged about this, and he included a good description of how Web services get used:

Now think of all the thousands of iPhone apps and how they amalgamate all kinds of Web services. You open your commuter traffic app, it calls on traffic information services, Google maps, a weather forecast and maybe an ad for public transportation. One browser app, many (API) calls.

Jim also mentioned how prominent APIs are becoming. For many popular websites, the network traffic generated by APIs actually exceeds the direct Web traffic. And that’s expected to continue. Perhaps even more interesting is the fact that these days, you don’t even need to be a programmer to use Web APIs. If you have played with Yahoo Pipes, or similar mashup tools, you know what I mean. Basically, these tools are empowering end users to create their own custom applications. Just drag and drop – no coding required.

So, they’re useful, widely used, accessible even to non-programming types, and becoming more popular by the day - but what in particular makes them so important in a Gov 2.0 context? I’d summarise it by saying that it’s about making it possible (and easy) for those outside of government to present statistics in a context that is meaningful and useful for them, and that can help facilitate informed discussion and decision making. If I want to provide a service to help people decide where to live, I could combine census statistics such as occupation, income, and age and mash them up with information about the location of shopping centres, pubs, etc. from a different service. I could achieve the same by gathering all the data into a database and building my service on top, but by accessing the data through an API, my information can remain current, and my queries can be run by calls to the API, saving me from the complexities and resources required to process the data myself. I can also leverage other services such as Google maps to present results. And of course, thanks to mashup platforms, this kind of application might just be something that a (non-programmer) individual does to satisfy their own interest. Either way, it makes it much more possible for people to take government information and use it in ways that government may never have chosen to do.
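To make the mashup idea concrete, here's a toy sketch in Java of that "where to live" service: two independent API calls, combined by the client. Both endpoints are invented for the example - a real version would call an actual statistics API and a places service:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

/**
 * Toy illustration of the mashup idea: two independent API calls,
 * combined by the client. Both URLs are made up for the example.
 */
public class WhereToLiveMashup {
    public static void main(String[] args) throws Exception {
        HttpClient http = HttpClient.newHttpClient();

        // 1. Income statistics for a suburb from a (hypothetical) statistics API
        HttpRequest census = HttpRequest.newBuilder(
                URI.create("https://stats.example.gov/api/income?area=fitzroy"))
                .build();

        // 2. Nearby amenities from a (hypothetical) places API
        HttpRequest places = HttpRequest.newBuilder(
                URI.create("https://places.example.com/api/pubs?area=fitzroy"))
                .build();

        String incomeJson = http.send(census, HttpResponse.BodyHandlers.ofString()).body();
        String pubsJson   = http.send(places, HttpResponse.BodyHandlers.ofString()).body();

        // 3. The "mashup": combine the two responses for the user.
        System.out.println("Income data: " + incomeJson);
        System.out.println("Amenities:   " + pubsJson);
    }
}

One browser app, many API calls - the combining logic belongs to whoever builds the mashup, not to the data provider.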

From a data provider’s perspective, there are many things to consider when looking at providing APIs for direct data access and querying.

1. API vs other means

An API can facilitate innovation, and help automate services that other organizations may provide based on the data. It can also provide transparency: by not colouring the data in any particular way, it leaves others free to analyse and present the data in their own way. On the other hand, if representing the data in certain ways is useful in promoting an organization’s mission, then it might be best to concentrate on delivering the appropriate views and/or viewing tools for the data. Or in some cases, it might make sense to do both.

2. Risk of abuse

Gartner analyst Andrea diMaio noted that separating data from its source, with no clear way to let consumers understand its lineage or quality, runs a great risk of it being misused, or deliberately doctored to represent the “facts” that best suit the application builder. What does this mean for the organization providing the data? Providers of official statistics go to great lengths to defend against this possibility, yet by providing data through APIs, they may in some way increase the risk of it happening. Perhaps one way to look at it is to realise that this can happen anyway, without APIs. And it is probably unreasonable to expect a provider to do more than publish accurate quality information alongside their data (and even make it queryable through the API) so that users can make informed choices about what constitutes valid use of the data.

3. Data Privacy Protection

Many statistical agencies have “remote access data laboratory” services to give researchers the ability to perform detailed analyses on their data. These typically involve manual checking processes to ensure that researchers’ queries do not breach data privacy laws by identifying individuals from the data (something that is very easy to do, even when data has been anonymized). A provider would need to determine what privacy risks are posed by making the data available through an API, and ensure that appropriate safeguards are put in place.

4. Resources

An API call results in some amount of processing. Depending on the specifics, such as the type of query and the volume of data, the level of computing resources required can be quite significant. In the beginning, one option may be to limit API use to a few specific applications, and expand that over time. Alternatively, the API could impose certain limits for any single user. This is the approach that Twitter uses to manage the enormous demand it generates.
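For illustration, here's a minimal sketch of a per-user limit of that kind: each API key gets a fixed number of calls per hour, after which requests are refused. The limit of 150 calls per hour is just an example figure, and a production limiter would sit in front of the API rather than inside it:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * A minimal per-user rate limiter (illustrative only): each API key
 * gets a fixed number of calls per hour, then requests are refused.
 */
public class HourlyRateLimiter {
    private static final int LIMIT_PER_HOUR = 150; // example figure

    private record Window(long hourStamp, int count) {}
    private final Map<String, Window> usage = new ConcurrentHashMap<>();

    /** Returns true if this call is allowed for the given API key. */
    public boolean allow(String apiKey) {
        long hour = System.currentTimeMillis() / 3_600_000L;
        Window updated = usage.merge(apiKey, new Window(hour, 1), (old, fresh) ->
                old.hourStamp() == hour
                        ? new Window(hour, old.count() + 1)  // same hour: count up
                        : fresh);                            // new hour: reset
        return updated.count() <= LIMIT_PER_HOUR;
    }

    public static void main(String[] args) {
        HourlyRateLimiter limiter = new HourlyRateLimiter();
        for (int i = 0; i < 152; i++) {
            if (!limiter.allow("user-123")) {
                System.out.println("Call " + (i + 1) + " refused");
            }
        }
    }
}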

Update: a wordle.net tag cloud of this post (image: “Wordle: Why APIs are important for Gov2.0”).

Tuesday, October 13, 2009

SuperSTAR Goodies - 6.7 Release progress

We would like to share the progress of some of the good stuff we have been doing in SuperSTAR development towards our 6.7 release.

Since transitioning to a fully agile process, we now run fortnightly iterations. From time to time, we will share the outcomes of an iteration and keep you all up to date.

Some of the key items that came out of this iteration were:
1) RecordView in SuperWEB2 - we have implemented our first two user stories:
"As a SW RecordVIEW user, I want a way of seeing all the unit records that relate to a crosstab table so that I can understand the detail behind the crosstabulation."
"As a SW RecordVIEW user, I want a filtered view of the unit records that relate to the cells I choose in a crosstab table so that I can focus on specific areas of interest."

We have implemented RecordView using GWT in the RESTful style. GWT gives us a rich Internet client user experience, and using REST makes it easy for other clients, such as SuperView, to consume the RecordView service.
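As a rough illustration of what the RESTful style buys us, the sketch below serves records over a plain HTTP GET using the JDK's built-in HTTP server, so any client - a GWT app, a browser, a script - can consume the same service. The URL shape and JSON body are invented for the example; the real RecordView service is richer than this:

import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

/**
 * Sketch of a record service behind a plain HTTP GET. The URL shape
 * and JSON are invented for illustration.
 */
public class RecordViewServiceSketch {
    public static void main(String[] args) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);

        // e.g. GET /records?cell=r2c3 -> the unit records behind that cell
        server.createContext("/records", exchange -> {
            String query = exchange.getRequestURI().getQuery(); // "cell=r2c3"
            String json = "{\"cell\":\"" + query + "\",\"records\":[]}";
            byte[] body = json.getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().set("Content-Type", "application/json");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
    }
}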

2) Aggregated mapping for SuperWEB2
“As a SW2 user, I want to have a faster mapping experience so that I can be more productive”.

The Mapping team have done some great work to improve the performance of our mapping solution in SuperWEB2. They have developed an ArcGISMap widget which allows SuperWEB2 to communicate directly with ArcGIS Server via a REST interface. This means much faster zoom and pan performance with maps.
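For the curious, ArcGIS Server exposes its map services over REST, so fetching a rendered map for a bounding box amounts to a single HTTP GET along these lines (the host and service names are placeholders, and the exact parameters may differ by version):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Path;

/**
 * Sketch of the kind of call the widget makes: one HTTP GET per map
 * image. Host and service names here are placeholders.
 */
public class ArcGisExportSketch {
    public static void main(String[] args) throws Exception {
        String url = "https://gis.example.com/arcgis/rest/services/Census/MapServer/export"
                + "?bbox=144.9,-37.9,145.1,-37.7"   // area of interest
                + "&size=800,600&format=png&f=image";

        HttpResponse<Path> response = HttpClient.newHttpClient().send(
                HttpRequest.newBuilder(URI.create(url)).build(),
                HttpResponse.BodyHandlers.ofFile(Path.of("map.png")));

        System.out.println("Map image saved to " + response.body());
    }
}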

3) SuperCROSS Local Annotations Refactor – we are making good progress to get the Annotations working correctly again in SuperCROSS and are on track with our plans.

4) Automated testing – we have also made good progress in automating the testing of SuperCROSS and SuperWEB2.

If you have any questions regarding our progress on the 6.7 release, or about any SuperSTAR product, please do not hesitate to contact us at support@spacetimeresearch.com


Sunday, October 4, 2009

Record VIEW Functionality in SuperWEB2 - comments welcomed


A guest blog from Don McIntosh, our product manager for SuperSTAR. Please feel free to give us comments or feedback so we can incorporate it into our product development while the feature is still being built.

What I wanted to cover in this post is a brief summary of what we are planning for RecordVIEW, as well as a few features that might come in a later release. I wanted to write about this now while we are developing it so that our customers and partners have an opportunity to comment and hopefully improve on the end result. Another thing we’ll do is provide a link to a test instance to let you play around with it once we have it up and running.

RecordVIEW is a key feature of SuperWEB - and one that is currently lacking in SuperWEB2. It gives users the ability to drill down into the records that contribute to any cell in a table and view other attributes of those records. We find that customers use it for a variety of reasons. Two of the most common are identification of individuals in interesting sub-populations, and data validation. An example of the former is “give me the list of names of all students who scored above 95% in the English test”. An interesting point is that almost all the time, the records extracted via RecordVIEW need to be subsequently fed into another system for the user to complete their task. That’s a useful one for us to keep in mind, because perhaps we can add much more value by allowing some kind of direct integration between the RecordVIEW action and other systems.

The first step for RecordVIEW is actually to cover off much of the functionality we had in the original SuperWEB. That means identifying some cells, switching to the RecordVIEW tab, choosing what fields to report on, and then downloading to XLS or CSV. The major addition for the first release, in comparison to what was in the original SuperWEB, will be the ease of use. The experience will be a lot more immersive, with fewer pauses for server updates and a richer UI. Click on a cell, choose RecordVIEW, and then choose what fields to view. You can choose all fields, or start with none and add a select few. You can also sort the results, and selectively filter what fields you’re interested in viewing. One other key feature I’ll mention is that the results of the RecordVIEW are transparently paginated, so if you have a very long list, the browser isn’t waiting a long time to update; it simply adds more as you scroll down.
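To illustrate the pagination behaviour, here's a toy model of the "fetch one page at a time" approach. The page size and record shape are invented, and the real implementation sits behind the GWT client rather than in a standalone class:

import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

/**
 * Toy model of "add more as you scroll": the client asks for one page
 * of records at a time instead of the whole list.
 */
public class PaginationSketch {
    static final int PAGE_SIZE = 50;

    /** Pretend record store: 10,000 unit records. */
    static final List<String> RECORDS = IntStream.rangeClosed(1, 10_000)
            .mapToObj(i -> "record-" + i)
            .collect(Collectors.toList());

    /** Return just one page; the browser requests the next page on scroll. */
    static List<String> fetchPage(int pageNumber) {
        int from = pageNumber * PAGE_SIZE;
        int to = Math.min(from + PAGE_SIZE, RECORDS.size());
        return from >= to ? List.of() : RECORDS.subList(from, to);
    }

    public static void main(String[] args) {
        System.out.println(fetchPage(0).size());   // 50 -> first screenful only
        System.out.println(fetchPage(199).get(0)); // record-9951, fetched on demand
    }
}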

We are of course very aware that for some datasets, RecordVIEW is not appropriate, due to the sensitive nature of the data. We will keep this simple: if there is confidentiality enabled for a database, then no RecordVIEW. Other permission functionality will remain unchanged from the earlier version.

Other key features we will consider later on include cell selection from other views, such as areas on a chart or map. Also, as I mentioned earlier on, we’d like to explore how RecordVIEW output might be more tightly integrated into a workflow that involves taking sets of records and feeding them into another application for further processing, or viewing them in a certain way.

It would be interesting to hear about some usage scenarios or feature ideas for RecordVIEW from our customers. We may be able to incorporate some scenarios in our acceptance testing, and hopefully learn about some ways to make this feature smarter and more in line with users’ core needs.

RecordVIEW will be available in the Release 6.5 November service pack.

Thursday, October 1, 2009

We're in the cloud! SuperWEB available now

I'm really excited to announce that we aim to be among the first companies to host applications on the Apps.gov website.

To get there, we needed to get SuperWEB up into the cloud, and this week, we hosted our first application on the Amazon EC2 cloud. Yesterday, I got my first Amazon bill - $10 / day so far and we uploaded a lot of data!

Background:
Vivek Kundra, the US Federal Chief Information Officer, has launched the new Apps.gov Storefront to enable US Federal Government agencies to buy cloud computing services as easily as a consumer can acquire a Gmail or Facebook account.

Cloud computing services reduce costs through reductions in purchasing and maintaining servers, while simultaneously improving service scalability to manage peaks and troughs in usage. Kundra says that besides encouraging better collaboration among agencies, he expects cloud services to reduce energy consumption because agencies will be able to share IT infrastructures.

Space-Time Research is responding to the recent US Federal Government request for proposal for applications to be hosted via the Apps.gov website. The Apps.gov Storefront is managed by the US GSA (General Services Administration) and SuperSTAR software is already available for purchase through the GSA e-Library.

Space-Time Research cloud offerings

In September, Space-Time Research initiated a cloud offering by hosting SuperWEB Software as a Service (SaaS) on the Amazon EC2 cloud service. SuperWEB is currently in the process of being assessed for inclusion in the Apps.gov website. Once certified, SuperWEB SaaS will be available to buy as a small, medium, large or extra large implementation on a pay-by-month basis.

At the end of October, SuperVIEW will be production-ready and available via a Google App Engine hybrid cloud service.

More about Apps.gov

Apps.gov is managed by the GSA development team, which is led by Casey Coleman, GSA’s CIO. In the article Kundra's great experiment: Government apps 'store front' opens for business, Coleman says:

“Through Apps.gov, GSA can take on more of the procurement processes upfront, helping agencies to better fulfill their missions by implementing solutions more rapidly,”

“We will also work with industry to ensure cloud-based solutions are secure and compliant to increase efficiency by reducing duplication of security processes throughout government."


Tuesday, September 22, 2009

My shortest blog ever

My three favourite sites at the moment:

As we enable public intelligence and data provision, and we're an Australian based company, I have to keep on top of this every day. I love how fast ideas are moving.

For all goodness in quality and testing management. If I ever have a question or problem to solve and I'm stuck, I go here. Good for inspiration and great ideas.

Apps.gov - just launched by the US Government, and we're going to be on it soon with a cloud provision of SuperWEB. Any US Government agency will be able to buy us through this process. Super-excited about this one.

Jo

Thursday, September 17, 2009

Bug Safaris - a different way to find bugs

Here is a post from Adrian Mirabelli - a Customer Quality engineer at STR. The idea for a bug safari came out of a presentation at the ANZ Test Board Conference in March 2009.

Bug safaris at Space-Time Research

For release 6.5, the STR quality team introduced “bug safaris” as a way to effectively and quickly find software bugs.

A bug safari involves most of the organisation - development, design, and management included - in hunting for bugs. Test cases or scripts are not necessarily provided, but guidance should be given. Certain areas are targeted, and interruptions are minimised to increase effectiveness. Note that a bug safari can be held multiple times over a release.

Planning is the key!

At the beginning of a bug safari, the quality manager invites the participants to a planning session or “kickoff”. The purpose of the kickoff is to define:

- The objectives of the session – including to communicate what is being done; all participants should be very clear about this by the conclusion of the session

- Areas to test and who will do it – this is important to ensure coverage and no wasteful duplication

- Configuration required and who will do it

- Test cases or documentation required and who will do it – the structure of these artefacts should be agreed; for example, is it a checklist or a matrix that is filled out on-the-fly?

- Some ideas of how to test – do something unusual or non-typical, test boundary values

- Duration of session

- Method for reporting issues and bugs


Typically the system configuration and documentation will be done by the testing team with the help of technical resources if required. Login information is distributed in advance. The quality manager needs to decide how to report results including submission of bug reports, and therefore plays a crucial role in this testing.

At the agreed date and time, the testing itself is performed, typically for between 45 minutes and two hours. The session is generally intense, as the mission is to find problems. The system testers are usually assigned to a product and work with the participants to help identify issues and troubleshoot problems. They can also be actively testing the system, depending on what was agreed at the kickoff session.

At the conclusion of the session, results are tabulated and any bugs found are raised in the incident tracking system.

Within the next couple of days at the absolute latest, a debriefing is held with the participants including the system testers. The quality manager reports on bugs found, and discussion is held regarding:

- The perceived level of success of the bug safari

- What can be improved for next time

- What worked well this time

- General feelings and sentiments

- Required actions and action owners.


Why not just use structured tests?

Procedural test cases, which follow a step-by-step test script, are excellent for communicating to the wider audience how you are testing and to obtain buy-in and feedback from stakeholders. In my experience, however, you can find bugs by looking around the software, not just looking at the expected results of the test case. Further, bugs are found when testing certain sequences of data, mouse clicks, configuration, operating systems and more, and it is expensive to write test cases for all these combinations.

Why involve people outside the testing team?

You and I are testing software every day. Just by using software you are testing whether it satisfies your need and your purpose. Everyone interacts with software differently, and is likely to try things out in various and different ways, some typical and some strange, so it is good to have such testing sessions to really verify the software is “fit-for-purpose”. It also gives the opportunity for fresh eyes to look and question the software, and test out other important elements such as usability and compatibility. It also increases the participants’ knowledge of the software, whilst testing the accuracy of the configuration and documentation, including the quality of test harnesses and pre-defined scripts.

What are the benefits?

Bug safaris are essentially “exploratory testing” with more tangible results. The results can easily be reported on charts or whiteboards and transferred to the test management and tracking system.

We allow the participants to exercise freedom of thought in executing tests. In this way we can find new bugs, as new combinations of tests are exercised, and quality improves as we address and fix those bugs based on their priority. Participants are encouraged to investigate any strange behaviour they find, perform further tests, and ask questions.

With everyone involved in testing, not just the test team, the visibility of the test organisation and the importance of testing both improve, while ownership of “quality” is shared by all people involved in the development of the software, from concept to implementation.

By performing such tests, we can capture and report many useful metrics, for example:

1. Number of bugs or issues found per session

2. Number of sessions run

3. Areas covered in the session with combinations

4. Time required to configure

5. Time required to test

6. Time required to investigate issues

The key is that people work together and discuss openly the software and what it does.

What are the challenges?

This method of testing is still relatively new; it is not a perfect method, nor a substitute for traditional testing methods. The key is to balance the proportion of structured versus unstructured testing, whilst ensuring that the results are captured sufficiently. Test tracking might, for example, require the participant to complete a spreadsheet, matrix, or running sheet.

What is being done in future?

STR will run bug safaris in future releases. Bug safaris have been shown to find bugs - important ones - and are continuing to win favour in the testing industry as their true benefits are realised. Introducing bug safaris has the advantage of not requiring major cultural or system changes, or expensive start-up costs.


Monday, August 24, 2009

Our Quality Vision (and Addressing Our Quality Past)

Like all software companies, we at Space-Time Research have juggled customer demands, complex software, very different uses of our software, and ever-changing requirements. This has sometimes resulted in us delivering release software to our customers that is not of sufficient quality, and later than we planned.

In the past, and as recently as the 6.3 release of our software, our testing group has passed a release, the software has been delivered to a customer, and then a critical issue has been found. One of the main reasons this happens is that every customer has a slightly different environment. We currently support Solaris, Red Hat Linux, Windows 64-bit and 32-bit, and Windows XP and Vista for our client applications, and browsers including IE6, IE7, IE8, Chrome, Firefox, and Safari. We read data from any relational database that has a JDBC driver, including Oracle, SQL Server, DB2 and others, plus different types of text files. We provide mapping with ESRI ArcIMS, ArcGIS Server, Google Maps and soon Bing Maps. We test all these environments, and on our servers, the tests can pass.

Then we get out to the customer environment and encounter different environments & constraints. Not everyone can host a Tomcat application and we might have to hook to IIS. Firewalls might be an issue. Ports might be an issue. The client might operate in a remote way. Even if we don't officially support a configuration, our clients will implement that way anyway and it's up to us to sort it out.

Once we have the software successfully installed and configured at a client site, they then build some databases and work out how they are going to analyse or visualise their information. Every client has different types of databases, structures and uses of their information. Our testing doesn't cover every different type of database - we try to, but of course we don't cover everything. So sometimes we miss things - hierarchical summation options being a recent example.

Finally, our customers use the software with their own workflow. We follow a standard workflow with our automated tests, and then we conduct exploratory testing that mimics what a customer would do, but as we are not the customer, we don't always get that exactly right either.

So, how do we improve it? What have we done and what are we doing next?

Firstly, for our 6.5 General Availability Release, Space-Time Research defined the following quality vision:
  • Timely, relevant, functioning software that works!
  • Performance, stability and resiliency focus.
  • Deliver releases of SuperSTAR that are perceived within STR and by our partners and customers as better than the previous release.
All decisions about testing, and then which bugs we fix, and when we release our software, are related back to the quality vision.

We implemented a partnership approach with some selected customers to enable them to test pre-release versions of our software. We conducted fortnightly builds, ran a couple of days of testing and then made the builds available to the customer. Builds were provided via FTP site, and customers were able to download the software and install in their own test environments. The customers were able to choose whether they would take a build or not. STR also hosted versions of our web applications so customers could do user interface testing without having to run their own installation and configuration.

The customers reported bugs, severity and their own priority via our normal support channel (via email to support@spacetimeresearch.com). We regularly triaged the bugs reported, and communicated via conference call with each customer to advise what we intended to do, or discuss concerns.

The benefits of this approach were clear for each customer involved:
  • Integration and configuration issues were ironed out during the pre-release phase.
  • Customer-focused testing found issues we would never have found.
  • The end delivery held no surprises.
  • We delivered on time to those customers and met their deadlines.
6.5 General Availability release is almost complete on all platforms. I'll do another blog and announcement about that separately.

For our next release, we are implementing a fully agile development process. Another blog on that is coming too! But for our customers, please know that we want to:
  • Involve more customers in pre-release testing.
  • Collect more sample databases from customers.
  • Collect reference data sets from customers so we can validate our statistical routines.
  • Use client test beds for complex or unusual environments.
  • Open up our change management and support processes so customers can track issues they are interested in.
cheerio

Jo


Wednesday, August 19, 2009

Open Data Initiative - Free SuperVIEW hosting of data


Space-Time Research this week launched a new program called the Open Data Initiative at the International Statistical Institute (ISI) 2009 conference in Durban.


What is the Open Data Initiative?
The Open Data Initiative is a Web 2.0 site for disseminating public data. Users discover and explore data in a rich, interactive, and intuitive application, rather than browsing or reading large documents of published tables and charts. End users can select and visualize any combination of data, and the results can be exported, printed, linked to, and shared in collaboration environments.

The Open Data Initiative is a freely available online service for the creation and dissemination of data for public consumption. You have the data; we have the service to disseminate it to the public.

The Open Data Initiative is hosted on the Google AppEngine Cloud, enabling providers of public data to create engaging and rich Web 2.0 experiences built on top of Space-Time Research's SuperVIEW product suite. This provides transparent, lightning-fast web traffic responsiveness, scalability and built-in redundancy no matter where in the world you are.

Data types suitable for the Open Data Initiative: Health, Transport, Education, Agriculture, Population Statistics, Labour Force, etc.

How do I sign up?
Contact us via the Open Data Initiative website

Key Benefits. The Open Data Initiative:
  • Is Cost and Time Efficient — Reduces the workload on your data analysts and researchers.
  • Provides Data that is Complete — Why compromise on providing a subset of the data? Maximize the ability of the public to self-service data of personal interest.
  • Provides Data as a Service — Now you can provide a new online data service to the public.
  • Protects the Relevance of Your Brand — Provide an engaging and rewarding experience for the public. This reinforces the relationship of trust they have in your organization.
  • Delivers Data Integrity — Have confidence that the public are seeing the right numbers, graphs, and maps, and reaching the correct interpretation and understanding behind those numbers.
  • Delivers Data Responsiveness — Minimize the time between data collection and data dissemination to ensure maximum relevancy of the data to the audience.
  • Creates Communities of Users — Ensure the online experience can be captured and shared by the public in collaborative environments, from blogs to Twitter.

Frequently asked questions coming from some of our early adopters:

Q. What is the business model for Space-Time Research?
A. This is a free service and as such it has business model restrictions for customers - they cannot charge a fee for access to their created sites. It must be public and not sit behind authentication or payment gateways. We have a paid service available that overcomes these restrictions but this is a good way to test drive the technology and the dissemination approach using the free service initially. Alternatively customers can purchase a paid SuperVIEW software license and implement their own business model around a deployed SuperVIEW.

Q. What about confidentiality?
A. No confidentiality capabilities are offered with the free SuperVIEW. The Open Data Initiative hosts all data in the Cloud, so by its nature the data provided should not contain confidential information. We can provide a confidential Cloud-based service using our Hybrid connector, but that becomes a paid solution engagement.

Q. How do statistical boundaries get loaded?
A. We will work through this in the data collection process over the next week with people who sign up to our early adopter program, but we think it will be along the lines of providing a shapefile (with some size limits -- i.e. pre-simplified and for particular areas) or KML to us.

Q. How does the application get integrated with the data provider's website?
A. Option 1 -> provide a link that takes the user from the data provider's website to the Open Data Initiative website.
Option 2 -> use an IFRAME to embed the Open Data Initiative hosted site into their website.

Friday, August 7, 2009

My favourite Cloud Reading List / Resources

A sampler of general resources I have found useful:

A great site - good definitions, interesting information. Written mostly by vendors. I'm signed up to be a contributor.

I'm not a fan of the word manifesto but this is a group that believes that cloud offerings should be open and compatible with each other. Good definitions as well, and good ideas.

Very interesting articles on this site, but as you will see, the advertising and navigation is annoying. Persevere.

Specific articles / resources:
I found this when I googled "does google app engine have an SLA". Gold.

Gartner's 7 cloud computing risks article

Another beware of risk article - this time from an Australian Government perspective.

The definitive pricing document for Google App Engine and how the quotas work.

Kundra's continued support of the idea is what makes me think it's all going to happen.

Amazon EC2 links:

Please feel free to add your own.

SuperSTAR and Cloud - nutting out the details of Google App Engine

I have spent the past week working further through the details of cloud offerings and how they can be used by existing and new SuperSTAR customers. It's still a hot topic, and I'm still sure it's a really good thing and something we should be offering our customers if they want it.

So far, Space-Time Research has put our SuperVIEW software into the Google App Engine cloud and demonstrated that it works just fine. We're confident that we're addressing many of the security concerns associated with clouds because we've chosen to go with a hybrid model, where the core database and SuperSERVER software still reside in the client's own server environment. It's just the application itself that is in the cloud. The data passed to the application is already aggregated and encrypted - the same data that would be passed to the web browser, and then to the user, regardless of whether the app is hosted in a cloud or not. So most of the security / data location concerns are not an issue with this model. [Aside: they will be an issue for our SuperWEB product, so we need to come up with a different way of handling that.]

At the risk of repeating myself, I do see some clear benefits of using this model over an internally hosted web server:
  • For clients who already have a SuperSTAR infrastructure, external web hosting can sometimes be difficult to arrange. This is an easy and inexpensive way to get around it.
  • Clients can take advantage of the scalability offered and handle peak loads without having to buy massive servers.
  • Of course, there's others. They're in last week's blog.
So now we have a viable demonstration to show customers. The next question I had to answer was - if a customer wants it, would it work for them in a production setting? How scalable is it really? Is it true that there is no SLA for the Google App Engine? And how much would it cost and for what? Here's what I found out:
  • It's true that there is no SLA for the Google App Engine. I reckon this rules out half of our customers straight away. Especially those who are data providers like the Australian Bureau of Statistics and want to reliably provide access to data and analytical tools to the world 24/7. Other customers, such as those who use our software for internal or researcher use, or those who are just starting out with SuperVIEW, might take this risk on board and try it out.
  • It's really difficult to work out what it would actually cost. Everything is costed by usage per day, and there's the option of a free service that then jumps into a paid service, or a paid service that gets more expensive as your user base grows. What we did work out is that it would be free for most SuperVIEW applications up to 2,000 user sessions per day. After that, it would cost approximately $300 USD per month for each additional 1,000 daily users - so, on those numbers, 5,000 sessions a day would come to roughly $900 USD per month.
Conclusion:
Using the Google App Engine for SuperVIEW is a good way for us to get started with a real cloud offering. We can do this cheaply, and we can pilot and test it over the next 6 months with some of our customers. Space-Time Research needs to provide alternatives for our customers so they can make informed decisions about which method to go with.

We know we have cloud-enabled software and maybe this is enough for now. Our software doesn't work on every cloud (e.g. by definition it won't work on Microsoft Azure because our application is Java-based), but it operates in virtual environments and can be hosted securely in some way outside a customer's server environment.

We need to understand more about the potential governmental restrictions - in particular what it means for Australia, the US and the EU which is where most of our customers come from. And what our customers expect us to do for them - do they want us to find the provider, or just provide assurance that our software will work in the cloud of their choosing? There is no point in us going down a path of recommending certain providers and finding out that the government would NEVER choose them.

A whole other topic, and one I will not write about but the gov2.0 thinkers are writing about, is the public servant aversion to risk and change that means that it's possible no-one wants to really do this anytime soon.

Next steps:
  • I am preparing a white paper for our sales team and customers on what we offer now and intend to offer in the future. This will have enough technical details to be able to talk to project owners / sponsors, but IT representatives will need more detail.
  • I'm finding out as much as I can about what governs our customer decisions now. I'm keen to get help on that because it's really hard to find out.
  • I'm talking to Gartner analysts to get their take. Particularly as the article I'm sent most often is one that Gartner wrote about the security concerns of cloud and what to watch out for.
  • I'm talking to a real cloud provider - Telstra - an Australian provider who is going to host all of Visy Recycling's applications, which is a significant move.
  • I'm going to ask questions of the Australian gov2.0 taskforce and see what they think about it.
  • I'm going to keep reading the articles I get sent every day.

Reflection:
Even though there is so much information out there, and so many articles and blogs being written about cloud computing at the moment, there is no one-stop answer shop that answers my questions. As I've been searching and figuring things out this week, I've really wished that I could just pick up the phone and call the Google App Engine product manager and talk to a real person. As I write, I realise I could have at least tried. This week's resolution is to ask more questions of real people via whatever means is appropriate.

Tuesday, July 28, 2009

Cloud Computing Services at Space-Time Research

I have been doing a lot of reading about cloud computing and concerns over security of data. In case you hadn't noticed, cloud computing is a hot topic and IT magazines and blogs are overflowing with articles. Kundra is talking about it (Kundra courts the risk of innovation -- Government Computer News), Gov 2.0 and Data.gov encourage it, and some US city departments are investigating moving all their services into a cloud (L.A. weighs plan to replace computer software with Google service - Los Angeles Times).


At first I wondered what all the fuss was about - it's only third party hosting of applications after all, and it's already been done – A LOT. Over the last few weeks I've delved a bit deeper, and discovered that my understanding of the technology, and options available, was limited. There are a number of different ways applications can be hosted or delivered via a cloud, and putting your application on a separate server housed at an external provider, which is what we do for some of our existing clients, is a very simple but expensive way to do it. I've since discovered there are other ways that might be better.


I have worked at and with large organizations over the last 20 years, and I understand why the idea of moving applications into a cloud is attractive. Sometimes it can be nearly impossible for a business unit within an organization to get a server or space on a server to host applications. And if you can get one, for some organizations, it can cost up to hundreds of thousands of dollars even if the server itself only cost a few thousand. Here we have an opportunity to get rid of one of the major stumbling blocks in putting a new application (particularly a web-application) out there.


The potential benefits of cloud computing are clear:



  • It can be MUCH cheaper. We've worked out that a basic SuperVIEW application could be hosted for under sixty dollars a month (depending on the number of users, etc.). This compares with an external hosting service cost of $1,500 AUD per month for a dedicated server.

  • It removes constraints imposed by IT departments, or even harder to deal with, IT Service Providers. The approvals to host applications on internal servers can be onerous.

  • It offers the ability to scale up or down, particularly when there is an initial peak load. I’m hoping that when we launch the 2011 Census data online with the Australian Bureau of Statistics, we can use cloud resources to cope with our initial peak loads.

  • As the hardware and infrastructure are already available, it can be very quick to deploy an application and start using it. No more waiting for the server to be ready.


The major considerations are:



  • Some cloud services offerings won't tell you or guarantee where your data is stored and this makes some organizations nervous.

  • The technology and the different options available are new and don’t necessarily follow strict government security procedures. I figure that by the time some government organizations are ready to launch an application, it will be sorted out.

  • Working out your optimal pricing can be a little tricky - it's a bit like a mobile phone plan and if you don't know how your system is going to be used, it can be hard to work out which is the most cost-effective model.


We have recently come up with a couple of cloud offerings for our SuperVIEW software that offer the best of both worlds for SuperSTAR customers. Our customers have given us some direct feedback that they are very interested in cloud models for hosting web applications, but they would like to keep their data in-house. This is not simply an issue of security; all of our customers have substantial data management systems in place, either fully in-house, or connected to privately outsourced data centres. Having the data for SuperVIEW hosted in-house ensures that the provider retains full ownership and does not have to extend its data management policies to address the differences that cloud computing would introduce.


Our HYBRID model fits this bill. The SuperVIEW application is hosted on a cloud provided by the Google App Engine. Via a secure data connector developed by STR, the application connects to a customer's existing SuperSTAR database housed internally. Encrypted, aggregated data is returned to the web application for analysis and visualization in the SuperVIEW web client. Because SuperSTAR databases are read-only, and cannot be manipulated by SQL or other programs, the raw data is secure and is not vulnerable to alteration or attack.
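Conceptually, the data path looks something like the sketch below: the cloud-hosted application asks the in-house server for already-aggregated results over HTTPS, and unit records never leave the customer's environment. The endpoint, query parameter, and bearer-token credential are all placeholders, not the actual STR connector:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

/**
 * Conceptual sketch of the hybrid model's data path: the cloud-hosted
 * application never touches unit records; it requests already-aggregated
 * results from the in-house server over an encrypted (HTTPS) channel.
 * All names here are placeholders, not the actual connector.
 */
public class HybridConnectorSketch {
    // In-house SuperSERVER endpoint, reachable only over TLS
    private static final String IN_HOUSE =
            "https://data.customer.example/connector/aggregate";

    public static String fetchAggregate(String tableQuery) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(
                URI.create(IN_HOUSE + "?query=" + tableQuery))
                .header("Authorization", "Bearer <token>") // placeholder credential
                .build();

        // Only aggregated, encrypted-in-transit results reach the cloud app
        return HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString())
                .body();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(fetchAggregate("population-by-age"));
    }
}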


We also want to offer the ability to experience the whole SuperSTAR application in a cloud using a different service provider. Currently, we provide fully hosted dedicated-server solutions, and over the next month we are working out who best to source these services from in a more distributed environment. There are some customers who will always want to keep their data management tools in-house, but others may want to migrate the whole solution to a cloud. We expect to be able to provide a hybrid or fully cloud-based SuperSTAR service to customers with the next release of our software, in the next month or so.


Until next time,


Jo
