Author Archive
The post is a snippet from a new whitepaper released on the website - “Druvaa inSync - Laptop Backup for Remote Workforce”, please download the original document from - http://www.druvaa.com/products/insync_docs.html
About 200 million employees work remotely, i.e. they are away from their desks while accessing email, sending work updates or editing financial sheets, and about 70% of this workforce never plans a backup before or during travel.
With over 600,000 notebooks lost at US airports, this post discusses the key concerns in remote backup, the shortcomings of existing solutions and how inSync addresses them.
Remote Backup Requirements
Some of the key requirements for remote or on-the-move laptop backup are –
- Availability of sufficient bandwidth - Statistics show that on average a business PC holds about 8GB of important corporate data. Almost 80% of this data is usually in the form of archived emails. The rate of addition/change of data is about 1%, but the actual differential change almost doubles to about 2% (because of the email archival formats and post-processing used by most email clients, such as Microsoft Outlook). Transferring this differential data (160 MB) over a WAN connection can be a problem, especially when the user wants to spend that limited time reading and answering new emails.
- Network access to backup server - While traveling, users often cannot access the enterprise LAN or VPN, which restricts their access to the organization’s secure resources. While working over the WAN, the user’s PC does not even have a static (publicly visible) IP address. This makes it impossible for the backup server to contact the user’s PC and trigger a backup.
- Security - Over the WAN, unencrypted backup/restore can expose corporate information, making it vulnerable to eavesdropping or theft.
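The bandwidth requirement above can be checked with a few lines of arithmetic. This is only a rough sketch: the 8GB and 2% figures come from the text, while the link speeds are assumed purely for illustration and ignore protocol overhead.

```python
# Figures from the post: ~8 GB of corporate data per PC, ~2% effective change.
DATA_PER_PC_MB = 8 * 1000        # 8 GB expressed in MB (decimal units)
EFFECTIVE_CHANGE_RATE = 0.02     # differential change after email-archive overhead


def differential_mb(total_mb: float, change_rate: float) -> float:
    """Size of the data that one incremental backup has to transfer."""
    return total_mb * change_rate


def transfer_minutes(size_mb: float, link_mbps: float) -> float:
    """Ideal transfer time in minutes over a link of link_mbps megabits/second."""
    megabits = size_mb * 8
    return megabits / link_mbps / 60


delta = differential_mb(DATA_PER_PC_MB, EFFECTIVE_CHANGE_RATE)
print(f"differential: {delta:.0f} MB")

# Assumed link speeds (not from the post), in megabits per second.
for mbps in (0.5, 1.0, 2.0):
    print(f"{mbps} Mbit/s -> {transfer_minutes(delta, mbps):.1f} min")
```

Even on these assumed links the backup competes with the user's email traffic for many minutes, which is the problem the requirement describes.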
Limitations of Existing Solutions
There are two major design limitations in most of the existing solutions, inherited from the tape-backup legacy -
- Server-triggered backup - Earlier, when mobility was not an option, network backups were driven by a central backup server. The legacy continues, and even today most network-based PC backup systems are server triggered. This architecture imposes the following limitations -
  - When the user is traveling, the server must initiate and send network requests outside the corporate network. Unlike a web or email server, this is a special case and needs more attention.
  - The PC must also be visible to the server on a published IP address.
  - Most of the work is performed by the server, so server scalability becomes a bottleneck as the enterprise grows.
- Dependence on a low-latency, high-bandwidth network - Thanks to the tape-backup legacy, most network backup systems today use the same old rsync-style checksum algorithms for incremental backups. Because these algorithms require a large number of network interactions, the backup systems work better on low-latency networks, where shorter round-trip times mean faster execution.
Druvaa inSync - Client Triggered Backup Architecture
Druvaa inSync is an automated enterprise laptop backup solution which protects corporate data while in office or on-the-move. It features simple backup, point-in-time restore, and patent-pending de-duplication technology to make backups much faster.

The Druvaa inSync client is a host-based soft driver that is installed on user PCs. It is equipped with sufficient backup intelligence to initiate and complete a backup. Configuring a client is a simple 5-step effort and can be completed within minutes of installation. The client-triggered backup architecture enables the client to contact the server and initiate backup over LAN/WAN, and ensures high levels of scalability and security.
Druvaa inSync Enterprise Server is a software service which runs on a dedicated server and can scale to serve terabytes of enterprise data. The server accepts backup and restore requests on published IP addresses using a 256-byte SSL encrypted channel and stores the data locally on 256-bit AES encrypted storage.
Data-deduplication - 10X Faster Backups
Druvaa inSync uses a patent-pending data de-duplication technology called SendUnique to remove duplicate data at the source (the user’s PC) before the actual data backup is initiated.

The data updates for a user are checked for duplicate content against existing data (from all the users) backed up at the server. The server then requests backup for only newly created unique content and maintains a reference for common content.
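The exchange described above can be sketched as a toy in-memory model. This is not Druvaa's SendUnique implementation (which is not published in this post); the class, the function names, the fixed 4096-byte chunks and the use of SHA-256 are all assumptions made for illustration.

```python
import hashlib


class DedupServer:
    """Toy server: stores each unique chunk once, shared across all users."""

    def __init__(self):
        self.chunks = {}      # digest -> chunk bytes
        self.manifests = {}   # (user, filename) -> ordered list of digests

    def missing(self, digests):
        """Digests the server has not yet seen from any user."""
        return [d for d in digests if d not in self.chunks]

    def store(self, user, filename, digests, new_chunks):
        """Save only the new chunks, and a reference list for the whole file."""
        self.chunks.update(new_chunks)
        self.manifests[(user, filename)] = list(digests)

    def restore(self, user, filename):
        return b"".join(self.chunks[d] for d in self.manifests[(user, filename)])


def split(data, size=4096):
    return [data[i:i + size] for i in range(0, len(data), size)]


def backup(server, user, filename, data):
    """Client side: hash locally, ask what is missing, upload only that."""
    chunks = split(data)
    digests = [hashlib.sha256(c).hexdigest() for c in chunks]
    wanted = set(server.missing(digests))
    upload = {d: c for d, c in zip(digests, chunks) if d in wanted}
    server.store(user, filename, digests, upload)
    return sum(len(c) for c in upload.values())   # bytes actually sent


server = DedupServer()
shared = b"quarterly report " * 1000
sent_a = backup(server, "alice", "report.doc", shared)
sent_b = backup(server, "bob", "report-copy.doc", shared)
print(sent_a, sent_b)   # the second user uploads nothing
```

The second user's backup sends only hashes; the server keeps a reference to the chunks it already holds and can still restore the full file for either user.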
November 12th, 2008
This weekend I downloaded and tried the new Druvaa inSync PC backup release v2.2 (news, download, release notes). In this post, I review some of the new features as an end user -
1. Admin configured backup folders
The new “admin configurable backup folders” feature is targeted specifically at enterprise setups where the admin wants better control over what gets backed up on users’ PCs. Now, for each profile, the administrator can choose “must have” backup folders and conditionally allow users to add one or more folders. This helps the admin implement standardized backup policies.

As the screenshot shows, the admin can either choose standard folders like “Desktop”, “My Documents” etc. or browse locally to choose standard backup folders, e.g. “D:\” or “C:\Important Data”.
2. Browser Restore
When you are not on your own PC, accessing your data can be really difficult. With the new browser restore feature, the browser becomes a handy way to access the backed-up files.
To provision the feature, the admin first has to choose the published IP address/port for web-restore in the server configuration and then enable “web restore” in the profile. The users of the profile can then set the password for web-restore in the client configuration (as shown below).

Once the password is set, the user can access the backup data over HTTPS using the published web-restore URL. The screenshot below shows web-restore in action -

3. New Server Status and Advanced Reporting
I have always liked products with good reporting capabilities. The ability to show current status/health and good reports/statistics without making the admin go through the logs is very important.
The new release comes with pretty good reporting capabilities. The new server status (shown below) tells the admin that everything is working fine.

And whenever the server faults or encounters a not-so-good situation (like “disk full” or “db not reachable”), an alert email notification is sent immediately (see below) -

The new reporting engine is what I liked the most. It now offers six different reports. As shown below, each of these reports can be scheduled independently. The admin can also use the “Report Now” feature for generating on-demand reports.

The screenshot below shows a snippet of “Complete Report” -

4. Remote Restore
At Druvaa, we are paid to restore your data. To make the restore even simpler for non-tech users, we have empowered the admin to schedule data restore for the user. The admin can now remotely browse the user’s data backup hierarchy (but cannot view the data) and remotely schedule a restore for any file/folder. The chosen file/folder is automatically restored to the user’s desktop.
The screenshot below shows how the admin can now browse and restore a user’s data -

5. Configuration API
With this release, Druvaa has also opened up the configuration API (as XML-RPC calls). This is mainly intended for third-party vendors, for better integration with their applications.
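To give a feel for what an XML-RPC configuration call looks like on the wire, here is a small sketch using Python's standard `xmlrpc.client` module. The method name `profile.set_backup_folders` and its parameters are hypothetical; the post does not document the actual inSync API, so treat this purely as an illustration of the protocol.

```python
import xmlrpc.client

# Hypothetical method name and arguments: NOT the documented inSync API.
METHOD = "profile.set_backup_folders"
params = ("sales-team", ["Desktop", "My Documents"])

# Serialize the call the way an XML-RPC client would send it in an HTTP POST body.
payload = xmlrpc.client.dumps(params, methodname=METHOD)
print(payload)

# A server decodes the same payload back into a method name and arguments.
decoded_params, decoded_method = xmlrpc.client.loads(payload)
print(decoded_method, decoded_params)
```

Because the request is plain XML over HTTP, a third-party application in almost any language can build the same call without a vendor-specific library.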
The Road Ahead
I must say, I am quite impressed with the new features. But I guess there is always scope for improvement :). Based on my evaluation, I have put in the following feature requests, and hopefully you will see these in the next major version -
- Server Status - include “uptime” and “reason for last downtime”.
- Cool new reports - Backup data composition (file-types) for every storage, configuration changes
- Search files in Restore - a search box for search on file name would be nice
November 3rd, 2008
The Storage Hunger
Sales of disk-based storage systems have already crossed 2500 Petabytes in 2008 and are up 58.1% YOY (one petabyte = 1 million GB). These figures do not include the direct-attached storage that comes pre-loaded with PCs or servers.[1]
This is understandable, as 1TB (1000GB) NAS/SAN storage devices are now a commodity. The top three vendors in this space are HP, IBM and EMC, with market shares of approximately 29%, 20% and 14% respectively.[2]
The overall consumption doubles when this storage is backed up.
Energy Consumption
On average a datacenter consumes 100 Watts/sq-foot of energy, and the best solid state storage consumes about 5 watts for 1MB IOPs.[3]
This puts the total cost of maintaining (cooling + power) a 1 TB disk array at about USD 2,500 annually (at 16c per kWh and 20 GB average daily usage).
This makes the annual energy cost of newly bought storage = USD 5 Billion !!!
And backing up this 5 billion dollar inventory surely adds a couple of billion more.
Data De-duplication
Data de-duplication technology saves a single copy of duplicate data. There are two important aspects of any data de-duplication solution/product -
- Scope of duplicate discovery - File-level / Sub-File level / Block level
- Point of duplicate discovery - Source / Target
Most storage vendors that use data de-duplication provide block-level duplicate removal at the target (i.e. when the data reaches the storage). But it’s not very difficult to imagine that source-level removal of sub-file or block-level duplicates would be much better, for two reasons -
- Sending less (de-duplicated) data saves time and bandwidth (apart from storage)
- Duplicate discovery would be much better, as you have access to the structured data
Considering Microsoft’s report on de-duplication assessment [4] -
- 20-30% data duplicates are easily visible even in unstructured data source like ERP databases
- 40-80% data duplicates can be seen in file-servers and mail servers.
- 60-90% data duplicates can be seen between different PCs. (Just my observation and opinion)
On average, a conservative 30% duplicate removal can save $1.6B on storage energy and $2B on bandwidth and backup costs.
De-duplication and Druvaa
We see Druvaa inSync as a product/platform to provide de-duplicated (at source) backup for PCs, PDAs and servers. The current version is available for just PCs and we can easily see up to 90% savings for time and cost (bandwidth and storage) for enterprises.
I just don’t see a reason why all storage and backup vendors wouldn’t do it. EMC and NetApp have already announced de-duplication as an additionally licensable technology on their arrays (target based).[5] No major vendor except EMC has announced agent/source-based de-dup though.[6]
Surely, Druvaa has a good lead and is cashing in on it.
September 20th, 2008
Some vital stats I could gather from Google/IDG/Gartner -
- Almost 200M employees work remotely (off their desk)
- Close to 637K laptops are lost at US airports annually
- 65% of users don’t do a backup before they start traveling
- 90% of users don’t back up while on the move
In the last 6 weeks I (personally) have heard the same statement from at least 8 VPs or CXOs - “I lost my data/notebook during my last trip at …” - and none of them had a backup.
If I have to summarize, the top three reasons I heard from these guys -
- My backup software doesn’t work over WAN
- I had limited bandwidth connectivity
- I hate backups - they slow down my PC and work
The case for Druvaa inSync
Well this is the exact market we are focused on. Druvaa inSync does a wonderful job of remote backups because -
- It offers 90% savings in time, bandwidth and storage needed for backups.
- WAN acceleration boosts speed over WAN
- Smart bandwidth prioritization sets only a percentage of bandwidth for backup
- Super Secure - inSync uses SSL encryption over WAN and doesn’t need a VPN for backup.
Druvaa inSync saves a single copy of data duplicated between users, reducing the average backup size by almost 90% (considering that corporate data is almost 80% duplicate between users). Yup, it’s truly unique and one of the biggest selling points of inSync.
Of course now these guys are taking business to me 
August 31st, 2008
For any enterprise, the definition and amount of “critical data” on laptops and desktops is increasing. This is fueled by increasing security concerns, user mobility and cross-geography office expansions. While expectations have increased, existing backup solutions haven’t adapted well to these changes.
They still depend upon large computational resources and a dedicated, trusted network/media for backups. The reason, I think, is that most PC backup solutions have been molded out of old server archival products.
In short, the key requirements for an enterprise PC backup should be -
- Simple and Automated
- Non-intrusive - Light weight and resource/power friendly
- Secure and Internet friendly
- WAN and bandwidth optimized
- Support for incremental backup for large files like Outlook PST
- On-demand restore points
Features Explained -
1. Simple and Automated
“Backing up your PC is one of those things, like eating right or changing your oil on time, that everybody knows they’re supposed to do, but too few people actually carry off well…”
Walter Mossberg, The Wall Street Journal
Surprisingly, most notebook backup solutions still have calendar schedules. IMO, this is prehistoric. Setup should take at most 5 steps, and schedules should be as simple as - “Run every 4 hours”.
2. Non-intrusive - Light weight and resource/power friendly
The primary reason employees hate backup is the system/network slowdown caused by a backup that kicks in as soon as the user logs in.
Laptops are replacing desktops in most enterprises, but the software still hasn’t evolved. Backups should be resource friendly and optimized for low power consumption. Also, simple options like these can make a lot of difference -
- Don’t back up when I am on battery
- Consume max 10% of my CPU
- Consume max. 20% of my bandwidth
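The three options above map naturally onto a small policy object. The sketch below is a hypothetical illustration (the class and field names are made up, not inSync settings) of how a client might decide whether to run, throttle or pause a backup.

```python
from dataclasses import dataclass


@dataclass
class BackupPolicy:
    # Defaults mirror the three example options in the list above.
    allow_on_battery: bool = False
    max_cpu_percent: float = 10.0
    max_bandwidth_percent: float = 20.0


@dataclass
class LaptopState:
    on_battery: bool
    backup_cpu_percent: float        # CPU currently used by the backup process
    backup_bandwidth_percent: float  # share of the link used by the backup


def next_action(policy: BackupPolicy, state: LaptopState) -> str:
    """Return 'pause', 'throttle' or 'run' for the current laptop state."""
    if state.on_battery and not policy.allow_on_battery:
        return "pause"
    if (state.backup_cpu_percent > policy.max_cpu_percent
            or state.backup_bandwidth_percent > policy.max_bandwidth_percent):
        return "throttle"
    return "run"


policy = BackupPolicy()
print(next_action(policy, LaptopState(True, 2.0, 5.0)))    # pause
print(next_action(policy, LaptopState(False, 35.0, 5.0)))  # throttle
print(next_action(policy, LaptopState(False, 4.0, 12.0)))  # run
```

How the state is measured (battery status, per-process CPU) is platform specific and deliberately left out of the sketch.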
3. WAN and Bandwidth optimized
Every company has a reasonably large mobile workforce, and it usually includes top-tier management (the CEO and the like). With increasing laptop thefts and data risks, backups should be WAN/Internet ready.
The user should be able to choose a bandwidth (something like use 10% of my bandwidth) and the backup solution should just do the job, even over the weakest internet links. This also greatly helps in cross-office backups and backup consolidation efforts.
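One simple way to honor a "use 10% of my bandwidth" setting is to pause between chunks so that the average throughput stays under the cap. The function below is a minimal sketch under that assumption; it presumes the client already knows (or has estimated) the link capacity, and the numbers in the example are made up.

```python
def pause_after_chunk(chunk_bytes: int, link_bytes_per_sec: float,
                      share: float, send_seconds: float) -> float:
    """Seconds to wait after sending a chunk so the backup's average
    throughput stays at or below `share` of the link capacity."""
    if not 0 < share <= 1:
        raise ValueError("share must be in (0, 1]")
    # Time one chunk is allowed to occupy at the capped rate.
    allowed_seconds = chunk_bytes / (link_bytes_per_sec * share)
    return max(0.0, allowed_seconds - send_seconds)


# Assumed numbers: 64 KiB chunks on a 1 MB/s link, backup capped at 10%.
pause = pause_after_chunk(65536, 1_000_000, 0.10, send_seconds=0.065536)
print(f"sleep {pause:.3f} s before the next chunk")
```

A real client would call `time.sleep(pause)` in its upload loop and re-estimate the link capacity periodically, since WAN conditions change.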
4. Secure and Internet friendly
Security is very important, especially when you are on a WAN/VPN. Most backup solutions are server triggered, which makes security policies for firewalls and monitoring very difficult (everyone is afraid when they see data flowing out of their network).
Backups should be client triggered, so that the server-side firewalls just allow and monitor inbound traffic. Also, the solution should be able to securely set up encrypted/authenticated channels for backup. (SSL channels are best when it comes to WAN/Internet.)
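A client-triggered encrypted channel can be sketched with Python's standard `ssl` and `socket` modules. The host and port are placeholders supplied by the caller, and the sketch uses TLS (the modern successor to SSL); it illustrates the connection direction and certificate checking, not inSync's actual protocol.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """TLS context that verifies the server's certificate and hostname."""
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


def open_backup_channel(host: str, port: int, timeout: float = 10.0) -> ssl.SSLSocket:
    """The client dials out to the published server address.
    The server never needs to reach the laptop, so no inbound
    firewall rule or static IP is required on the client side."""
    raw = socket.create_connection((host, port), timeout=timeout)
    return make_client_context().wrap_socket(raw, server_hostname=host)


ctx = make_client_context()
print(ctx.verify_mode == ssl.CERT_REQUIRED, ctx.check_hostname)
```

On the server side, the firewall then only needs to allow and monitor inbound connections on the published port.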
5. Support for incremental sync for large files like Outlook PST
With data volumes increasing and the WAN coming into the picture, it is very important that backups are incremental in nature and that only the changed bits are copied to the server.
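For a large file such as a PST, "only the changed bits" can be approximated by comparing per-block hashes against those recorded at the last backup. The sketch below assumes fixed 4096-byte blocks and SHA-256, both chosen only for illustration, and it does not handle files that shrink.

```python
import hashlib

BLOCK = 4096  # assumed block size


def block_digests(data: bytes) -> list:
    """One SHA-256 digest per fixed-size block."""
    return [hashlib.sha256(data[i:i + BLOCK]).digest()
            for i in range(0, len(data), BLOCK)]


def changed_blocks(old_digests: list, new_data: bytes) -> dict:
    """Return {block_index: bytes} for blocks that differ from the last backup."""
    changed = {}
    for i, digest in enumerate(block_digests(new_data)):
        if i >= len(old_digests) or old_digests[i] != digest:
            changed[i] = new_data[i * BLOCK:(i + 1) * BLOCK]
    return changed


old = bytes(5 * BLOCK)                 # previous version: five zero-filled blocks
new = bytearray(old)
new[2 * BLOCK + 10] = 1                # modify a single byte inside block 2
delta = changed_blocks(block_digests(old), bytes(new))
print(sorted(delta))                   # [2]
```

Only one 4 KB block out of five has to travel over the WAN, instead of the whole file.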
6. On-demand restore (points)
Sending an email to the admin to get the data back is surely a complete no, especially when the user may be off-site or traveling. The backup software should facilitate a smart (possibly browser-based), remote and secure data restore.
So next time you choose a backup software for your personal or enterprise needs, make sure it has evolved to have the above mentioned features.
And remember - backup more, backup often.
August 23rd, 2008
Read an interesting article at McKinsey Quarterly, which discussed that the storage demand is increasing at a much higher rate than the falling storage pricing.
For backups, both the storage cost and the bandwidth availability are not able to catch up with increasing storage demand.
Take the iPhone as an example. The amount of storage in the iPhone has increased from 4GB to 16GB, but the media and bandwidth available for backup haven’t changed much.
The problem is particularly challenging for remote and on-the-go backups. It’s an interesting fact (I read somewhere) that almost 200 million enterprise users are working remotely at any given time. And there is a very good possibility that they haven’t backed up their data.
This is where Druvaa inSync comes in. The SendUnique technology ensures that duplicate data on enterprise devices is backed up just once, giving a clear 90% advantage in the bandwidth and storage used for backup.
Currently we ship the product for only notebooks, but soon plan to cover every device connected to enterprise network from PDA to Servers.
Any takers ?
July 19th, 2008
The Gartner Report (here) says storage data de-duplication and virtualization are the two main technologies driving innovation in storage management software this year. This makes sense, considering that corporate data is increasing at a whopping 60% annual rate (Microsoft report here).
Server Backup
Data is very rarely common between production servers of different types. It’s not difficult to imagine that an Exchange email server may not have the same content as an Oracle database server. But data is largely duplicated within file servers, Exchange servers and, say, a bunch of ERP servers (development and test). This duplication creates potential bottlenecks for the bandwidth and storage used for backup.
Existing players have offered two solutions to this problem -
- Traditional single-instancing at the backup server to filter out common content, e.g. Microsoft Single Instance Service (in Data center edition). This saves just the storage cost, depending upon the level at which commonalities are filtered - file / block / byte. A big player in this space is Data-Domain. These solutions don’t have a client component; they just save storage space.
- New innovative solutions like Avamar (now with EMC) and PureDisk (now with Veritas), which try to filter content at the backup server level before the data goes to the (remote) store. This makes these solutions much better suited for remote-office backups. They save bandwidth and storage.
But there are three unsolved problems with both these approaches as well (which also explains the poor response to these products in the market) -
- Most of the time, simple block checksum matching fails to find common data, as it may not fall on block boundaries. E.g. if you insert a single byte in a file, all the blocks after it shift and the block checksum approach fails.
- Checksum calculation is very costly and makes backups CPU intensive.
- These approaches target storage cost, not time/bandwidth, which is more critical.
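The first problem in the list is easy to reproduce. The sketch below (block size, data and hash function are arbitrary choices for the demo) hashes fixed-size blocks of a buffer, then inserts one byte at the front and counts how many block hashes still match.

```python
import hashlib
import random


def fixed_block_hashes(data: bytes, size: int = 64) -> set:
    """Set of SHA-256 digests of consecutive fixed-size blocks."""
    return {hashlib.sha256(data[i:i + size]).digest()
            for i in range(0, len(data), size)}


rng = random.Random(0)                      # fixed seed for a repeatable demo
original = bytes(rng.randrange(256) for _ in range(4096))

before = fixed_block_hashes(original)
after_insert = fixed_block_hashes(b"X" + original)   # one byte added at the front
after_append = fixed_block_hashes(original + b"X")   # one byte added at the end

print("blocks still matching after front insert:", len(before & after_insert))
print("blocks still matching after append:      ", len(before & after_append))
```

Inserting at the front shifts every block boundary, so effectively nothing matches, while appending leaves all the original blocks intact. Fixed-block matching is therefore very sensitive to where in the file the edit happens.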
PC Backups
The problem is much more complex at PC level, as duplicated data is distributed among users and is as high as 90% in some cases. Emails / documents and similar file formats create large pool of duplicate data between users.

Also, since 50% of a PC backup is mainly large email files, this problem is particularly difficult to solve using the simple file-based de-duplication techniques used by servers.
Druvaa inSync v2.0 uses an on-wire (distributed) de-duplication technique which senses duplicate data before the backup starts and hence skips it from the backup. This is transparent to the user; all he notices is a 10 times boost in backup speed, with over 90% reduction in bandwidth and storage usage.
How it works
This technology creates and maintains a Global “Single Instance” File System at the backup server. Each time a user wants to back up a file, the inSync client prepares a file fingerprint (using a linear polynomial based hash) and sends it to the server for comparison. After the server responds, the backup happens only for the “unique” data within the file.
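The post does not disclose the actual fingerprinting algorithm, so the following is only a generic sketch of one well-known approach in the same family: content-defined chunking driven by a polynomial rolling hash. The window size, base, modulus and mask are arbitrary assumptions. Because chunk boundaries depend on content rather than on offsets, inserting a byte disturbs only the chunks near the edit.

```python
import hashlib
import random

WINDOW = 16            # bytes covered by the rolling hash (assumed)
BASE = 257
MOD = (1 << 61) - 1
MASK = 0x3F            # cut a chunk when the low 6 bits of the hash are zero


def chunks(data: bytes) -> list:
    """Split data at positions chosen by a polynomial rolling hash
    over the last WINDOW bytes."""
    out, start, h = [], 0, 0
    top = pow(BASE, WINDOW - 1, MOD)       # weight of the byte leaving the window
    for i, byte in enumerate(data):
        if i >= WINDOW:
            h = (h - data[i - WINDOW] * top) % MOD
        h = (h * BASE + byte) % MOD
        if i + 1 - start >= WINDOW and (h & MASK) == 0:
            out.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        out.append(data[start:])
    return out


def fingerprints(data: bytes) -> set:
    return {hashlib.sha256(c).digest() for c in chunks(data)}


rng = random.Random(1)
original = bytes(rng.randrange(256) for _ in range(4096))
edited = b"X" + original                   # one byte inserted at the front

server_has = fingerprints(original)        # what an earlier backup stored
edited_chunks = chunks(edited)
to_send = [c for c in edited_chunks
           if hashlib.sha256(c).digest() not in server_has]
print(f"{len(to_send)} of {len(edited_chunks)} chunks need uploading")
```

In a client/server setting, the client would send only the fingerprints first and upload just the chunks the server reports as unknown.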

The (patent-pending) advanced file fingerprinting makes it computationally very easy to filter common content like the same paragraphs in different documents, the same CCed email, media-rich corporate presentations etc. This cuts down backup time by 10 times and reduces bandwidth and storage utilization by 90%.
Other Interesting Features
Another good use of the Global Single Instance File System is Continuous Data Protection. After starting a restore, the user can see how his files changed over time, which gives him the option to restore point-in-time data from any point in the past. The marketing name for the feature is - “Eternity. Never lose a file. Ever.” A long name, but it serves its meaning.
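Point-in-time restore boils down to "give me the newest version saved at or before this moment". A minimal sketch of that lookup, with a made-up class name and integer timestamps standing in for real backup times:

```python
import bisect


class FileTimeline:
    """Keeps every backed-up version of one file, ordered by time."""

    def __init__(self):
        self._times = []      # sorted backup timestamps
        self._versions = []   # content (or a chunk manifest) per timestamp

    def record(self, timestamp: int, content: bytes) -> None:
        if self._times and timestamp <= self._times[-1]:
            raise ValueError("timestamps must be strictly increasing")
        self._times.append(timestamp)
        self._versions.append(content)

    def at(self, timestamp: int):
        """Newest version saved at or before `timestamp`, or None."""
        i = bisect.bisect_right(self._times, timestamp)
        return self._versions[i - 1] if i else None


timeline = FileTimeline()
timeline.record(100, b"draft")
timeline.record(200, b"reviewed")
timeline.record(300, b"final")
print(timeline.at(250))   # b'reviewed'
print(timeline.at(50))    # None
```

With a single-instance store underneath, each version would hold only references to chunks, so keeping many versions of a file costs little extra storage.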
Business Opportunities
The same technology/product can be stripped down to back up PDAs and scaled up to back up servers. A good use case would be to reduce the backup time for a bunch of related remote servers.
June 15th, 2008
Druvaa announces the availability of Druvaa inSync 2.0 Beta. The idea behind v2.0 is fast and bandwidth/storage-efficient backup. The much-awaited release brings four very interesting and unique features -
1. SendUnique - Enterprise-wide on-wire data de-duplication. Almost 80% of PC data (emails/docs) within an enterprise is common between users. SendUnique technology fingerprints the user’s backup set to send only one copy of data (emails/docs) common between different users to the backup server. This speeds up backup by almost 10 times and cuts bandwidth usage by 90%.
2. Eternity - Never Lose a file. Ever. Timeline based, from-the-past restore. Enables ultimate protection against data loss or virus attacks.
3. NetworkSense - Automatic network sensing and prioritization. Allocates a user defined percentage of bandwidth for backups.
4. TrueSecure - Client triggered secure backups. 256 byte network (SSL) and 256 bit (AES) storage encryption.
You can sign up here for beta evaluation and updates.
The following presentation describes the 2.0 feature set -
May 23rd, 2008
“We are paid to Backup not Restore … “, this was the company slogan at my last job.
And the team really followed it well, seriously. Look at the existing backup solutions: they have tons of options for backup, and most of these options are derived from prehistoric tape-based backups. In fact, most of the solutions confuse backup with archival. Take a look at the following backup options -
- Backup Rule Engines
- Complicated Scheduling and archival times
- Naming a backup or scheme or snapshots
But there are hardly any options for restore; besides, they hardly even work.
I guess restores can be made much better, with options like -
- Self-serve, web based restore.
- Give an option to restore just a part of the backup.
- Search a file in Restore.
- Restore based on Timeline. ( Choose a date and restore based on that date. )
- Support compression for faster restores (just like backups).
With Druvaa inSync we tried to address most of these issues. Now, with the upcoming 2.0, we plan to add file searching and timeline-based restores.
April 20th, 2008
We still couldn’t figure out what makes the traditional backups so slow and resource hungry.
But we benchmarked inSync against a normal network copy using an HP dual-core machine configured for 50 users. Here are the results -
You can download the PDF (117KB) and other related documents from http://www.druvaa.com/download/insync.html
Should the results differ for a larger set of users? It looks unlikely, but we will soon try and publish (upcoming) version 2.0 benchmarks with a larger set of users.
- Jaspreet
April 18th, 2008