14 April 2019

If the cloud is the computer, who are the cloud developers?

Nobody wants to be in Operations. It doesn't come with all those technology toys so abundant these days, and it means a quite pronounced compensation gap compared to those with "developer" in their title. These tensions gave rise to a popular HR cliche - the "DevOps engineer". But its popularity doesn't come only from the dissatisfaction of operations engineers. And while it solves the HR pain, it doesn't solve the business problem.

The millennium brought a significant change in computing infrastructure for commercial applications. It used to be hand-made, sometimes even with physical tools. Now it is API-controlled. Just as Ritchie called malloc() in the 1970s to get some room for application data, we call a (cloud) API to get memory and computational power and to interconnect the parts of an application system. The cloud is not a bunch of virtual machines; it is a medium running application software systems. The cloud may be a public IaaS like AWS, on-prem middleware like Kubernetes, or a serverless platform, but those are details. The cloud is the computer for application software. It is a big deal, and it does affect the professionals working with it.
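
To make the malloc() analogy concrete, here is a minimal sketch of "allocating" and "freeing" a unit of the cloud computer through a provider API - Python with boto3 against AWS EC2; the region, image id and instance type are illustrative placeholders:

    import boto3

    # "malloc" for the cloud computer: ask the provider's API for a unit
    # of memory and computational power instead of racking a server.
    ec2 = boto3.client("ec2", region_name="us-east-1")
    response = ec2.run_instances(
        ImageId="ami-0123456789abcdef0",  # placeholder image id
        InstanceType="t3.micro",          # placeholder instance size
        MinCount=1,
        MaxCount=1,
    )
    instance_id = response["Instances"][0]["InstanceId"]
    print("allocated:", instance_id)

    # ... run the application ...

    # And "free": return the resource to the medium when done.
    ec2.terminate_instances(InstanceIds=[instance_id])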

Computing infrastructure is becoming less separated from application code, evolving from custom-built bare-metal servers to application middleware, to immutable VMs, to serverless and cloud-native architectures. The staff working with computing infrastructure are expected to have the skills of regular software engineers, because software has taken over what used to be controlled by human management processes.

Deployment of an application system to computing infrastructure used to be an operations activity, requiring skills in operations and operational processes. Now the deployment topology of an application system is part of the application design, its implementation is part of the application code, and it is delivered like any other software. Deployment engineers have to have the traits of software developers. They need to know not just how to code, but also how to plan, deliver, and ensure quality.

Meanwhile, operation of software systems is becoming routine - much less craft or magic. Humanless or push-button operation has become an implicit requirement from business stakeholders willing to reduce risks and the cost of ownership. Application operations dissolve into the product teams; eventually operations ceases to be a separate activity worth a dedicated department. On the other side, shorter release cycles, Continuous Delivery, and the DevOps transformation have made operations an invaluable source of feedback for application developers, a major pillar of the QA process. Former technicians are now effectively doing what is known as "exploratory testing". The toil-elimination trend calls for automation. The scale and velocity of application systems make eyeballs inadequate for system behavior analysis. This brings software code that does operation, monitoring, and remediation. Done right, it becomes the discipline referred to as SRE.

Application development also has to adapt to this "cloud computer". Applications have to be coded for the cloud computing architecture and APIs - one cannot just throw a Linux ELF or a Java JAR binary at a cloud infrastructure. The cloud computer is efficient - if it is used efficiently. Lastly, the cloud computer is enormous but not reliable, simply due to its scale.

All that stretches the responsibility of the existing professional groups (primarily Development and Operations) way beyond a comfortable level. Operations suffers more because the same people work within different (often mutually exclusive) processes.

The old-school professional profiles do not work well in the new landscape; they need a transformation. HR rebranding "Operations" to "DevOps" does not help much. The business and engineering problems need more: professionals skilled and trained for the new problem domain - application development and delivery with cloud computing - and processes matching that domain. These professional roles may be:

  • Application developer. Expert in the business logic and UX domains. They will learn the holistic system view from Operations.
  • Quality assurance engineer. Coming from the testers (quality control) group, they need analyst skills and full-stack familiarity with the application system.
  • Reliability engineer (SRE). Expert in automation, operations and feedback. They adopt a lot from QA.
  • Cloud developer (let's call them that). Expert in the infrastructure and system operation domains. They are to acquire decades' worth of the software delivery body of knowledge.

The advent of cloud computing has forced the change, but it also simplifies it. The "cloud computer" takes on a lot of the operations and infrastructure development burden. It lets professionals working in the application domain focus on their specific problems.

"Cloud developer" is a new title, but it has roots in the past. It is a system software engineer moved to the era of cloud computing. Same traits - strong knowledge of the underlying infrastructure and familiarity with system operations. Same challenges - long product lifetime, difficulty of changes and high expectations for quality. Though "cloud developer" is not the same as "system software engineer". A system software engineer implements a tiny interface to the underlying hardware in a way that is as application agnostic as possible. While a cloud developer makes a skeleton of an application system. As a skeleton it is not functional by itself but has the shape and major properties of the resulting living product or service. More than any other part of the application body it demands holistic understanding of the habitat, the scale and the behavior.

Obviously a cloud developer is a member of a development team. A team is a group of professionals working towards a common goal following shared processes. What is distinct about development teams:

  • The work is planned. With all the agility, development work has to be planned or it won't reach its goal.
  • The work creates transferable artifacts. A product of the work is useful by itself, separate from its creators.
  • Work artifacts have quality guarantees built in - provable and enforceable.

Quality is paramount for an application system skeleton because it is difficult to fix or change later. A skeleton needs to absorb, without crumbling, the harm and wear of unplanned changes, shortcuts and defects over the system lifetime - while the quality almost never improves over time. Luckily, the software development industry has created plenty of quality assurance tools: design methodologies, feedback cycles, high-level languages, cross-validation, and various forms of evaluation and testing, to name a few.

But why should an organization invest in the change and grow this new professional group? Cloud technologies are maturing. Business takes them (mainly their promises of efficiency and velocity) for granted. DIY craft by "jack of all trades but master of none" people rebranded from Operations ceases to match these expectations. Application development for the cloud is a big slice, and it is not uniform. It requires complementary skill sets too broad to fit comfortably in one human head. That justifies another professional branch. And it provides new opportunities for those sitting in the development and operations silos, whether they are labeled with "DevOps" in the title or not.

15 May 2016

Our team doesn't use Scrum

We're a small team of experienced engineers developing an infrastructure system for an external customer. We deliberately rejected Scrum as the project management framework and didn't regret it. Why?
Infrastructure projects don't bring direct value (unless infrastructure is the primary business). The value takes the form of cost savings and comes later, when the new facility gets substantial use. Due to the delayed utility, stakeholders usually don't want to allocate much time and manpower, which means the timeline is under pressure. On the other side, infrastructure is a foundational thing, so sacrificing quality is not an option. So beyond limiting the initial feature set to an essential minimum, the project should be run in a way that
  • is lean (has as low overhead as possible);
  • lets the team meet deadlines safely;
  • does not hurt product quality or incur unmanageable technical debt.
The Scrum framework mandates fixed-length (1-4 week) iterations (Sprints), each iteration having its distinct goal. Technically it means a fixed, time-boxed backlog of "User Stories" for an iteration, and the goal is to complete the backlog within the iteration. The first issue is that the complexity of Stories doesn't care about the iteration length (or capacity - the team's ability to do a certain amount of work within an iteration). E.g. we may have an iteration capacity of 10 "units" (story points, man-days or whatever the team uses for estimation) and a few Stories of 4 units each. If we pick 2, we plan to waste 2 "units" in the iteration. If we pick 3, we're planning to fail the iteration goal. It's alignment overhead.
For an infrastructure project, Stories take a lot of labour: the average time to complete one is comparable to the iteration length. Given that the alignment overhead is about <time_to_complete_a_Story>/2 * team_size, it's prohibitively high.
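
A toy calculation of that alignment overhead, using the numbers from the example above (a sketch, not project data):

    # Iteration capacity of 10 units and Stories of 4 units each:
    capacity = 10
    story = 4
    fit = capacity // story                    # 2 Stories fit
    waste = capacity - fit * story             # picking 2 idles 2 units
    overcommit = (fit + 1) * story - capacity  # picking 3 overshoots by 2 units
    print(fit, waste, overcommit)              # -> 2 2 2
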
Estimation of development tasks isn't exact. Even for a long-running project and an experienced team it comes with uncertainty. I.e. if a task is estimated at 5 days, there is a 50% chance it will take longer than 5 days (or will be expedited, trading quality for time), and in 50% of cases it will be completed earlier. In the first case the team fails the Sprint. To avoid that, it may add a safety margin to estimations that won't be used in most cases. In the latter case (early completion) we get some spare time, but there are no tasks planned for it. This blog post discusses the issue in more detail. It's estimation uncertainty overhead.
We know from statistics that the relative error of the total is O(1/sqrt(N_of_tasks)). It goes down with smaller tasks, but for an infrastructure project tasks are long, so the error is high, leading to a higher safety margin. Roughly, we have to waste <estimation_error> + <margin> per team member.
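
A small Monte Carlo sketch of that statistics argument - the lognormal spread of actual effort around the estimate is an assumed illustration, not measured data:

    import math
    import random

    def relative_error(n_tasks, trials=10000, sigma=0.5):
        # Relative error (std/mean) of the total duration of n_tasks,
        # each estimated at 1 unit, with lognormal actual effort.
        totals = [
            sum(random.lognormvariate(0.0, sigma) for _ in range(n_tasks))
            for _ in range(trials)
        ]
        mean = sum(totals) / trials
        var = sum((t - mean) ** 2 for t in totals) / trials
        return math.sqrt(var) / mean

    for n in (1, 4, 16, 64):
        # the error roughly halves each time N quadruples: ~O(1/sqrt(N))
        print(n, round(relative_error(n), 3))
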
A standard Sprint has a planning session at the beginning and a demo and retrospective at the end. Practically it means 2 days are lost for productive work. It's planning overhead.
Each Sprint is to deliver customer-visible value. Practically it means Sprints are packed with User Stories; bug fixes come after, unless they're really breaking. As shown before, there is a high chance the team has to expedite tasks by cutting corners here and there. Together these trends sink both the visible product quality and its long-term sustainability (which is especially important for infrastructure). Because each Sprint is like the others, nothing stops this degradation. Some teams introduce recurring "stabilisation sprints" to address it. But in our case the customer has to invest a long time ahead, so it's difficult to convince them that we need extra time to fix our own kludges. It's bit rot.
OK, an alternative?
One that preserves Scrum's undoubted advantages of
  • keeping the development aligned with the goal;
  • observing and controlling progress to match the project timeline;
  • sustained delivery of user-visible value?
There is no silver bullet. For our case (which is pretty common these days) we use a set of techniques. We call it "Scrumban".
To eliminate the operational overhead we took a practice primarily focused on it - Kanban. It gets rid of the alignment and estimation uncertainty costs. It relieves the "deadline-driven development" syndrome. With an explicit limit on work in progress it supports sustained value delivery and keeps the team focused. As a side effect it lowers the planning burden - which matters, because the human brain is quite bad at that activity. Having Kanban boards in most project tracking tools is very helpful - colored paper stickers on a whiteboard aren't that fun in reality.
But Kanban doesn't help to keep the development on the right track. Here daily standups and weekly reviews help. The daily standup runs as we are used to from Scrum - it starts the day, it's short, it answers the 3 questions, and it's the main control knob for keeping things rolling in the right direction day to day. The weekly review resembles Scrum's Retrospective and Planning at once, but it's much lighter. At this event the team
  • selects tasks to do next;
  • agrees on an implementation plan;
  • refines estimates and adjusts scope.
Plus there is a retrospective review, but with no red tape at all: only things that affect the team's progress and are controlled by the team are touched. All of that safely fits into one hour - the last working hour on Friday. That closes the organizational overhead for the week.
Also, Kanban does little to follow the project timeline. Here we use a burndown chart. For the chart to work we need a time span (the whole project, from 1 to 4 months) and a backlog of User Stories. The Stories should be estimated in any units that reflect their complexity; we adopted the traditional agile method of comparable tasks.
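
A minimal sketch of the burndown bookkeeping - the backlog size and the daily completion log are made-up numbers:

    # Remaining backlog per day vs the ideal straight line to zero.
    backlog = 60                          # total project estimate, in units
    done_per_day = [0, 3, 2, 5, 0, 4, 6]  # units completed each working day
    days = len(done_per_day)

    remaining = [backlog]
    for done in done_per_day:
        remaining.append(remaining[-1] - done)

    for day, left in enumerate(remaining):
        ideal = backlog * (1 - day / days)
        print(day, left, round(ideal, 1))  # actual vs ideal remaining work
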
Then the team needs to keep the customer happy and itself in shape. The Sprint Demo does that in Scrum; we use control points - "milestones" set 4-6 weeks apart. Each milestone means a consistent set of features is shown to the customer. But milestones don't assume there is nothing in progress at that moment, so a milestone doesn't incur the alignment cost a Sprint does.
Finally - planning. For customer-requested development we have to deal with fixed-price projects, which means fixed time. So we have to estimate upfront anyway and arrange the length and scope with the customer. Practically, we get a set of estimated User Stories for one big "iteration" - the project. To make it manageable we keep these projects between 1 and 3 months (unless they consist of many similar things). As shown earlier, the relative estimation error for a longer iteration is lower - until it starts to rise, when our planning shifts from extrapolation to pure fantasy.
However, these techniques don't magically boost team efficiency. Kanban by itself is not easy. It may be seen as an iterative process where the iteration is one day long. It requires a much higher level of self-discipline from everybody on the team. Overall, the workflow is much less regulated - but that means it's denser. And there are no more hours of sitting in a meeting room tossing planning poker cards or watching Jira boards on a wall screen.
I have applied this approach for the last 5 years, refining it over time. It works and is effective, given certain prerequisites are met:
  • The team is small, seasoned and co-located.
  • Everybody practices a solid communication discipline - keeping teammates informed without disrupting them.
  • The work is expressed in User Stories in the classical sense, not in a "do that, that and that" style.
  • Long-term development is formed from fixed-scope, limited-length (1-3 month) compact projects.

19 January 2014

Jenkins performance hints

Well, the Jenkins CI server has its scalability limitations. But for the vast majority of applications its performance is quite sufficient. There are installations with hundreds of slaves running about 10k builds daily. While Jenkins configuration is relatively simple, some art is required to set up and maintain a busy server. Below are some suggestions on how to keep it fast, divided into Master configuration, Slave configuration and Job design, plus a few notes on multi-master Jenkins.

Jenkins Master configuration

Number of plugins. Plugins cause performance issues for builds (because of hooks) and for the UI (because they add stuff to it). Do not add too many plugins, and in any case evaluate them thoroughly [ref].

Number of jobs. Jenkins gets slow (at least in the UI) with 1000+ jobs [ref]. Moving jobs to several masters (manual static sharding) helps, e.g. one master for builds, another for tests. Functional segregation lets you simplify each Jenkins configuration and decrease the number of plugins, while splitting a big master into two similar ones just leaves a complex configuration on each master.
Keep the number of active jobs reasonable; remove unused ones.
Use the Git and Gerrit Trigger plugins to serve multiple branches with one set of jobs.

Jobs on Master. There should be none - only light internal tasks crucial for Jenkins housekeeping. Definitely no application jobs.

SCM polling on Master. SCM polling for Git or Perforce requires execution of the CLI program for each check of each job. For reliable polling it should be configured to run on the master; polling on slaves (the default for both VCSs) is bad because slaves are disposable.
Use push hooks instead of polling. For Git, use the Gerrit Trigger - the "Ref update" event can replace SCM polling in most cases. For Perforce ... set the polling period to something large; use "H" or "@hourly" for the cron expression in the polling configuration.
Subversion uses SVNKit instead of the CLI, so it is not affected.

Build lazy-loading. When the JVM minimum and maximum heap sizes differ, WeakReferences (lazy loading uses them) get garbage-collected before the JVM tries to expand the heap [ref]. This causes extra load from re-loading builds and sometimes may lead to disappearing build records.
The JVM configuration for servers should have the minimum and maximum heap sizes set to the same value (e.g. -Xms4g -Xmx4g).

Access control. Authenticated users should be allowed to do anything except system administration [ref]. "Trust users not to be malicious. Don't trust users not to do daft things - or read documentation, or to have well behaved unit tests." [ref] Trust encourages. But it also helps to save on authorization: complex authorization (e.g. the Role Strategy plugin) kills UI performance, and API performance suffers too.

Disk IO performance. Use fast disks for the configuration (startup time) and build records (build lazy-loading) [ref]. An SSD on the master helps a lot [ref]. Separate the configuration, build records and artifact storage. Also worth a look: pluggable artifact transfer and storage (JENKINS-17236).

Use an external API/UI frontend for Jenkins. Jenkins is not very good at UI performance, and UI plugins worsen it even more. Workarounds: external UI dashboards or frontend systems [ref]. Examples of problematic plugins:

  • The Dashboard View plugin has a real problem with lazy-loading (though it is thought to be fixed) [ref].
  • The Nested Views plugin causes permission re-evaluation for each job on the server, several times over. Using regexps to filter jobs makes it worse; worth trying explicit lists of jobs instead of regexps. Replacing it with the CloudBees Folders plugin might help, but that needs evaluation.

HTTP cache. A fast HTTP proxy in front of Jenkins to cache static data might help [ref], but it requires further evaluation.

Servlet container. Embedded Winstone (before 1.535) or Jetty 8 (1.535+, but not in 1.532.1 LTS) vs Tomcat. Jetty used to beat Tomcat on consistent throughput and resource consumption, but for recent Jetty 8-9 and Tomcat 7 there is no clear evidence of that.

Jenkins Slave configuration

Number of slaves. There is the "X1K initiative" - a goal for Jenkins developers to assure smooth operation of a master with 1000 executors across all slaves [ref]. It is still a challenge. Somewhere around 250 slaves and lots of builds, slave connections start getting broken in the middle of a build [ref], and there is evidence of Jenkins tending to lose connections to slaves once there are about a hundred of them [ref]. Since the thread usage improvements in Jenkins remoting (core 1.521) and SSH Slaves plugin 0.27 this should no longer be an issue [ref, ref], but it is not proven yet.

Number of executors per slave. Increasing the number of executors beyond the slave's capabilities decreases overall throughput - due to clashes, IO congestion or RAM swapping. Balance RAM, CPU cores and the build type. RAM should be enough for the maximum number of concurrent builds at the maximum memory setting, plus file cache. CPU should be enough to stay below 100% utilization, taking IO into account - IO waits release some CPU time. Have fewer than 1 executor per CPU core for single-threaded builds. Consider the IOPS limit, to avoid disk IO becoming a bottleneck. Generally, if the 15-minute load average is above the number of cores, the number of executors should be decreased. There is a suggestion of 1 executor per slave for isolation [ref]; it is reasonable in a cloud, but on dedicated hardware the same isolation can be achieved with lightweight containers.
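
A rough sizing sketch following these rules of thumb; all inputs are illustrative assumptions, not recommendations:

    def max_executors(cores, ram_gb, build_ram_gb, cache_gb=2, single_threaded=True):
        # RAM bounds the maximum number of concurrent builds, leaving room
        # for the file cache; fewer than 1 executor per core for
        # single-threaded builds keeps CPU utilization below 100%.
        by_ram = int((ram_gb - cache_gb) // build_ram_gb)
        by_cpu = cores - 1 if single_threaded else cores
        return max(1, min(by_ram, by_cpu))

    print(max_executors(cores=8, ram_gb=16, build_ram_gb=3))  # -> 4, RAM-bound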

Job design

Workspace cleanup - removing the job workspace before a build (to get a clean build) or after it (to save disk space). It adds time for a fresh checkout, and even more for Maven to download dependencies; in the end the build may run a few times longer.
Address it in the build system: have a reliable "clean" target in the build script, do not create files outside of temporary build directories, never touch files under version control. Clean up workspaces periodically to be sure. Always do it for "release" builds, where build speed is not as important as build sanity.
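
For the periodic cleanup, a minimal sketch of a housekeeping script that removes workspaces untouched for two weeks - the workspace root path and the age threshold are hypothetical:

    import os
    import shutil
    import time

    ROOT = "/var/jenkins/workspace"  # hypothetical workspace root on a slave
    MAX_AGE = 14 * 24 * 3600         # two weeks, in seconds

    now = time.time()
    for name in os.listdir(ROOT):
        path = os.path.join(ROOT, name)
        if os.path.isdir(path) and now - os.path.getmtime(path) > MAX_AGE:
            shutil.rmtree(path)      # reclaim disk from stale workspaces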

Artifact fingerprinting. A large fingerprint database may kill Jenkins master performance. The Copy Artifact plugin always checks fingerprints; Maven builds record artifact fingerprints unconditionally.
So prevent code review (Gerrit) builds from recording fingerprints for Maven 2/3 builds [ref], perhaps by disabling Maven artifact archiving. This applies to freestyle builds too, but there it is controllable.

Post-build actions. Limit post-build steps - they serialise parallel builds (JENKINS-9913). Move the work into build steps, e.g. use a custom artifact archiver (as a build step) such as "mvn deploy".

Maven jobs vs Freestyle jobs. Use Freestyle - Maven jobs are notably slower and have their own set of bugs. Even the Maven job type itself is considered bad by a core Jenkins contributor [ref].

Large build logs. The build log is loaded into master memory, causing an OoM error if the log is too big. Use the Log File Size Checker plugin to fail a job whose console log reaches the limit.

Sonar analysis. Sonar analysis on each build makes the build 2-3 times longer while adding little value - Sonar is a monitoring and code inspection tool, not a gatekeeper. Run it nightly, not on each build.

Reference repository for Git SCM. A Git repository on the local file system can be used as a reference - only the update is downloaded, the rest is hardlinked.
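
A sketch of the underlying mechanism (in Jenkins the reference path is set in the Git SCM clone options rather than called by hand; the mirror path and repository URL are hypothetical):

    import subprocess

    # Clone borrowing objects from a local mirror: only new objects are
    # fetched from the remote, the rest are shared from the reference repo.
    subprocess.run([
        "git", "clone",
        "--reference", "/srv/git/project-mirror.git",  # local mirror
        "https://example.com/project.git",             # upstream repo
        "workspace",
    ], check=True)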

Multi-master?

There are no multi-master Jenkins clusters, and none is expected in the foreseeable future [ref, ref]. The only way to share load between masters without custom software is to set up 2 masters, each with its own set of jobs.

  • Jenkins Enterprise by CloudBees - fault tolerance only [ref]. It is an "active - spare" cluster; no load balancing.
  • Jenkins Operations Center by CloudBees - simplifies management of multiple masters and slaves, but does not provide a multi-master instance with a single point of entry [ref].
  • The OpenStack/HP multi-master setup uses custom software (Zuul + Gearman) and a specific standardized workflow on top of it [ref]. It does not use the Jenkins UI, only provides direct links to builds in Zuul or Gerrit. Build history, analytics and trends are collected via an external search engine [ref].

General & Cultural tips

Follow the "Keep it simple, stupid" and "You aren't gonna need it" principles.

References

  1. “Keynotes”. Kohsuke Kawaguchi, Cloudbees. Jenkins User Conference 2013 - Palo Alto.
    Slides: http://www.cloudbees.com/sites/default/files/juc/juc2013/2013-1023-JUC-PaloAlto-Kohsuke-Keynote.pptx
    Video: http://www.youtube.com/watch?v=FaMoiVpKUvQ
  2. “Multiple Jenkins Master Support”. Khai Do, Hewlett Packard. Jenkins User Conference 2013 - Palo Alto.
    Slides: http://docs.openstack.org/infra/publications/gearman-plugin/
    Video: http://www.youtube.com/watch?v=pLQddm85fPQ
  3. “Maintaining Huge Jenkins Clusters - Have We Reached the Limit of Jenkins?” Robert Sandell, Sony Mobile Communications. Jenkins User Conference 2013 - Palo Alto.
    Slides: http://www.cloudbees.com/sites/default/files/juc/juc2013/2013-1023-Palo-Alto-Robert_Sandell-Maintaining-Huge-Jenkins-Clusters.pdf
    Video: http://www.youtube.com/watch?v=LRonDiXUx1U
  4. "To Infinity & Beyond the Small Team" James Nord, Cisco
    Slides: http://www.cloudbees.com/sites/default/files/JUC_Palo_Alto_2013_TIaBTST.pdf
    Video: http://www.youtube.com/watch?v=CGjgS16dVUc
  5. “Scaling Jenkins Horizontally with Jenkins Operations Center by Cloudbees”. Cloudbees blog: http://blog.cloudbees.com/2013/12/scaling-jenkins-horizontally-with.html
  6. “Jenkins at Three Years: Becomes Literate, Does Mobile in the Cloud and Handles Multi-Branch”. Harpreet Singh & Kohsuke Kawaguchi, CloudBees. Jenkins User Conference 2013 - Palo Alto.
    Slides: http://www.slideshare.net/kohsuke/jenkins-user-conference-2013-literate-multibranch-mobile-and-more
    Video: http://www.youtube.com/watch?v=AKcQuOROFlI
  7. “Jenkins Scalability Summit notes”. Jenkins Scalability Summit, Oct 2013 - Los Altos. https://docs.google.com/document/d/1GqkWPnp-bvuObGlSe7t3k76ZOD2a8Z2M1avggWoYKEs/edit#
  8. “Kohsuke with OSS hat / Core improvements”. Jenkins Scalability Summit, Oct 2013 - Los Altos.
    Slides: https://wiki.jenkins-ci.org/download/attachments/68747344/Kohsuke.pptx
  9. “Sony Mobile list to Santa Claus”. Robert Sandell, Sony Mobile. Jenkins Scalability Summit, Oct 2013 - Los Altos.
    Slides: https://wiki.jenkins-ci.org/download/attachments/68747344/Sony+Mobile.pptx
  10. “Reducing the # of threads in Jenkins: SSH slaves”. Kohsuke Kawaguchi, Cloudbees. Jenkins CI blog: http://jenkins-ci.org/content/reducing-threads-jenkins-ssh-slaves
  11. “High availability”. Jenkins Enterprise: http://www.cloudbees.com/jenkins-enterprise-cloudbees-features-high-availability-plugin.cb
  12. “Jenkins' Maven job type considered evil”. Stephen Connolly. Stephen's Java Adventures. http://javaadventure.blogspot.ru/2013/11/jenkins-maven-job-type-considered-evil.html

14 January 2014

Big Jenkins servers of 2013

The Jenkins CI server used to scale reasonably well by the standards of the 2000s. That does not hold anymore: Jenkins was not initially designed for a decade of development or for huge installations with hundreds of servers. So the experience of large installations is helpful for understanding Jenkins' real abilities.

The parameters of a few large Jenkins installations have been published.

Details:

  • OpenStack / HP uses a Gearman server + 2 Jenkins masters + 300 dynamic slaves to handle 10k builds daily; one master could not keep up.
  • Sony Mobile has 7 independent Jenkins servers; the largest one is "Jenkins Regular @ SELD" in Lund. One master (24 cores, 64GB RAM, 6TB disk) with 300 slaves (2-4 HT cores, about 8GB RAM each) handles 6k builds daily. Sony Mobile configures 1 executor per slave.
  • The Yahoo Advertising Platform team has a primary Jenkins master (12 HT cores, 96GB RAM, 1.2TB disk + 20TB networked storage for job and build data) and 3 backup masters in 2 data centers, with 50 slaves in 3 data centers. It performs 8k builds per day, producing 6TB of data.
  • Netflix has its builders in the Amazon cloud: 6 independent masters (AWS m2.2xlarge - 4 cores, 32GB RAM each) and 100 slaves (AWS m1.xlarge) run 4000 builds daily, generating 3TB of build data.

Links

  1. "Maintaining Huge Jenkins Clusters - Have We Reached the Limit of Jenkins?" Robert Sandell, Sony Mobile Communications. Jenkins User Conference 2013 - Palo Alto.
    Slides: http://www.cloudbees.com/sites/default/files/juc/juc2013/2013-1023-Palo-Alto-Robert_Sandell-Maintaining-Huge-Jenkins-Clusters.pdf
    Video: http://www.youtube.com/watch?v=LRonDiXUx1U
  2. "Multiple Jenkins Master Support" Khai Do, Hewlett Packard. Jenkins User Conference 2013 - Palo Alto.
    Video: http://www.youtube.com/watch?v=pLQddm85fPQ
    Slides: http://docs.openstack.org/infra/publications/gearman-plugin/
  3. "To Infinity & Beyond the Small Team" James Nord, Cisco
    Slides: http://www.cloudbees.com/sites/default/files/JUC_Palo_Alto_2013_TIaBTST.pdf
    Video: http://www.youtube.com/watch?v=CGjgS16dVUc
  4. "Jenkins Scalability Summit notes". Jenkins Scalability Summit, Oct 2013 - Los Altos. https://docs.google.com/document/d/1GqkWPnp-bvuObGlSe7t3k76ZOD2a8Z2M1avggWoYKEs/edit#
  5. "13,000 jobs and counting". Mujibur Wahab, Yahoo!. Jenkins Scalability Summit, Oct 2013 - Los Altos.
    Slides: https://wiki.jenkins-ci.org/download/attachments/68747344/Yahoo.pptx
  6. "How Jenkins Builds the Netflix Global Streaming Service". Gareth Bowles, Brian Moyles, Netflix. Jenkins User Conference 2012 - San Francisco.
    Slides: http://www.cloudbees.com/sites/default/files/juc2011/JUCSF_2012_Building-Netflix-Streaming-with-Jenkins_JUC.pdf
    Video: http://www.youtube.com/watch?v=GF0p7jTf6tk

22 August 2013

Fleeing from Catch Notes

Catch Notes terminates on August 30, 2013. Catch (earlier ThreeBanana) was the best note-taking app for, well, taking notes. I.e. it doesn't have all that colorful UI clutter requiring N taps for a simple action; it mostly serves (text) note taking, with light organization and sharing features. It has mobile apps, cloud storage for sync, and a web UI for desktops. Where to flee now?

Simplenote seems to be an appealing cloud text synchronization service with an open API. But there are no satisfactory, actively maintained Android apps for it. And Google Trends clearly predicts it is headed for the same end as Catch.

OK, what else? The most popular remaining note-taking apps for Android are ColorNote, Evernote and InkPad. Evernote is overloaded and pricey - it's just a completely different kind of app. But the other two look light. Both have cloud sync, though it is triggered manually. No sharing, and:

  • ColorNote can import notes from Catch but has no desktop or web client (it's only planned) and no good way to organize notes. Still, the Android app is convenient, has wiki-links between notes and the ability to archive notes.
  • InkPad has no organization tools either, but it has a web UI. The Android app's UX is a bit rough.

OK, what's next? Google Keep, MobisleNotes, Microsoft OneNote and Springpad. OneNote and Springpad are by no means easy to use. Moreover, OneNote brings SkyDrive along - way too much for just a note-taking app.

  • Keep - can you imagine keeping several hundred notes there?
  • MobisleNotes - has folders and supports collaboration on notes, but the app has not been updated for a year now, and the developers have been "planning the future of the service" since April 2013.
  • Springpad - is it a note-taking app or a Pinterest prototype? Last time I gave it a try it clearly lost to Catch.

The last tier worth considering: apps with 500,000+ installations on Google Play - GNotes and Simple Notepad by mightyfrog.

  • GNotes uses a somewhat exotic cloud storage - GMail. Thus it relies on Google's goodwill, which may change.
  • Simple Notepad is like InkPad minus the web UI - it uses Dropbox for sync.

The remaining apps in the Play Market are not worth considering, taking their user bases into account.

Finally: either ColorNote or Evernote. Both have utilities to import from Catch. Both are harsh compromises.

19 July 2013

Virtualization cost for build automation

A build time trend chart from a Jenkins job building a Maven project tells the story: around build #480 the job was switched from a bare-metal builder to a same-size virtual machine, and the average build time increased by ~25%.

The job was running on a dedicated Jenkins slave machine running CentOS 6 Linux. Earlier it was an i7 4-core + HT CPU with 8 GB RAM and software RAID0 over 2 rotating disks. Then it moved to a Xen 4 virtual machine (paravirtualized) with the same characteristics, except the number of logical CPU cores was set to 7 instead of 8. No other load was put on the physical box, so the performance drop is due to virtualization alone.

The build time change was similar for the other jobs. Moreover, C# builds running on Windows Server 2008 boxes showed the same figures. It means that migration from a physical machine to a Xen VM costs about a 25% increase in build time.
Given various comparative benchmarks of different hypervisors, I don't expect better results from other virtualization technologies. Of course, things like Linux cgroups don't count. Another important note - we use local disks; with networked storage things are completely different.

24 February 2013

Keep cloud data private - mission possible

Everybody has some digitized data - texts and diaries, address books, photos, videos. Most of it is not intended for sharing; some of it must be kept quite secret. These things used to live on a desktop computer at home. Unfortunately, a desktop box cannot be put into a pocket and carried everywhere, so the data naturally moves to a place that is accessible most of the time - to the cloud.

In recent years the amount of personal data in the cloud has grown explosively. The uncertainty around it has grown too, and for a reason. Our mail, documents, photos, updates etc. are not in an abstract, neutral "cloud". All that data is stored on very physical disks owned very directly by a company controlled by a limited group of people. So it is quite possible to wake up one morning and find all your stuff disclosed, sold or just dropped. It may sound unreal, but last year's changes in the Facebook and Instagram policies and the growing number of data disclosure requests to Google indicate the opposite.

There is a growing desire to keep personal data on personally controlled media. Unfortunately that is not realistic. Yes, most people are getting more networked devices. But most of those devices - smartphones, tablets, laptops - are not permanently interlinked, have limited capabilities, and cannot be considered reliable storage. At the same time storage devices become exponentially larger and cheaper, with no signs of saturation. Thus privately owned devices cannot compete with large networked storage, i.e. with the cloud.

The obvious solution is to keep private data in the Net, but distributed (geographically and administratively, to lower the risk of data loss) and encrypted (to minimize the risk of disclosure). This model is utilized by certain emerging p2p storage networks. But those networks rely on permanently connected computers - desktops or servers - and nowadays such beasts are becoming rare.

It is worth noting that this kind of problem is not new. Communication networks - snail mail, telephone, the Internet - faced similar requirements: they must be global, reliable and cheap to use. None of them belongs to a single organization; they consist of numerous independent service providers. By this analogy, Google and Facebook are like UPS and DHL - great but specific services. The common storage fabric should be formed by many companies providing the same service and interoperating on a standard ground. At some point such a service may become as common as cellular networks or broadband Internet connections.

Of course it won't work unless it is profitable for the service providers. Here again the communication networks model may be adopted: service endpoints collect money from end users, and it then gets distributed among all participants. Either modern communication network operators will ride the trend, or the data giants will overtake and acquire the telecom business - it may work out either way.

Last but not least, the data service must be extremely convenient to use. Fortunately that is mostly a set of technical problems. E.g. the most obvious issue - assuring authorised access and encryption for a user across different devices - can be solved with security tokens or biometric technology.