BOINC Client Communication Infrastructure, Part III

Parsing Responses

When I mentioned earlier that the BOINC XML parser assumes a line-by-line parse, what I meant was that the XML responses are relatively flat and are parsed in one pass. This has some interesting ramifications; for instance, the relationship between XML elements is determined by what was parsed before them.

For an example of what I mean, I’m going to use the uber GUI RPC, which dumps the BOINC Daemon’s entire state and should be called at the beginning of any session. Most of the mini update RPCs do not contain any relational information, so you have to run a quick query against the results to map a workunit to a project, for instance.

Here would be the request:



5
5
15

Here is a simplified version of the response:



5
5
13


Alpha Project



alphaapp
Alpha Application



aapp_5.24_windows_intelx86.exe



alphaapp
524

aapp_5.24_windows_intelx86.exe





Test1
alphaapp
524



Test1_0
Test1


Beta Project



betaapp
Beta Application



bapp_5.24_windows_intelx86.exe



betaapp
524

bapp_5.24_windows_intelx86.exe





Test1
betaapp
524



Test1_0
Test1


So in this example you can see that there is no explicit reference to a project in any of the app, app_version, file_info, workunit, and result elements, because they are all assumed to belong to the same project until the next project element is parsed.

This example illustrates what I mean by a relatively flat XML structure. XPath and XQuery would have a hard time returning results from data that is not hierarchical in nature.
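The one-pass association described above can be sketched in a few lines of Python. This is only an illustration, not the actual BOINC parser: the element names project, app, app_version, file_info, workunit, and result come from the text, while the child tags (`project_name`, `name`, `app_name`), the function names, and the sample input are assumptions made up for the example.

```python
import re

# Elements that implicitly belong to the most recently parsed <project>.
CHILD_ELEMENTS = {"app", "app_version", "file_info", "workunit", "result"}

OPEN_TAG = re.compile(r"<(\w+)>$")               # e.g. <workunit>
CLOSE_TAG = re.compile(r"</(\w+)>$")             # e.g. </workunit>
VALUE_LINE = re.compile(r"<(\w+)>(.*)</\1>$")    # e.g. <name>Test1</name>


def parse_state(text):
    """One-pass, line-by-line parse of a flat state dump (illustrative only)."""
    projects = []
    current = None  # field dict of the element that is currently open
    for raw in text.splitlines():
        line = raw.strip()
        value = VALUE_LINE.match(line)
        opened = OPEN_TAG.match(line)
        if value and current is not None:
            current[value.group(1)] = value.group(2)
        elif opened and opened.group(1) == "project":
            current = {}
            projects.append({"fields": current, "children": []})
        elif opened and opened.group(1) in CHILD_ELEMENTS and projects:
            # No explicit project reference: attach to the last project seen.
            current = {}
            projects[-1]["children"].append((opened.group(1), current))
        elif CLOSE_TAG.match(line):
            current = None
    return projects


def workunits_by_project(projects):
    """Return (project name, workunit name) pairs in parse order."""
    return [
        (project["fields"].get("project_name"), fields.get("name"))
        for project in projects
        for kind, fields in project["children"]
        if kind == "workunit"
    ]


# Hypothetical input; the tag names inside each element are assumptions.
SAMPLE = """\
<project>
<project_name>Alpha Project</project_name>
</project>
<app>
<name>alphaapp</name>
</app>
<workunit>
<name>Test1</name>
<app_name>alphaapp</app_name>
</workunit>
<project>
<project_name>Beta Project</project_name>
</project>
<workunit>
<name>Test1</name>
<app_name>betaapp</app_name>
</workunit>
"""

pairs = workunits_by_project(parse_state(SAMPLE))
```

Both workunits are named Test1, and the only thing that tells them apart is which project element was parsed most recently. That is exactly the information a position-independent query would lose.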

—– Rom

BOINC Client Communication Infrastructure, Part II

For all intents and purposes the BOINC Manager and BOINC Screensaver are the same type of application as far as the BOINC Daemon is concerned. They both use the BOINC GUI RPC protocol to direct the BOINC Daemon to perform certain tasks. For the purposes of this article I’m going to assume a valid TCP connection is established between the managing application and the BOINC Daemon.

While the protocol is based on XML, there are a few additional rules:

  • No line shall exceed 256 characters in length.
  • Line feeds shall separate all containers and tags.
  • Unicode is not supported. Everything should be MBCS.
  • Parsing of the XML is assumed to be line-by-line. XPath and XQuery queries are not very useful.

While this is valid XML, the BOINC XML parser will choke on it:


12

This is what it would have to look like for the BOINC XML parser to parse it correctly:



1
2
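A quick way to catch messages that break these rules before they reach the daemon is a small checker like the sketch below. It is written in Python for illustration and encodes one reading of the rules: a line may hold a single tag, or a single `<tag>value</tag>` pair, and nothing more. That reading, the function names, and the tag names `a`, `b`, and `c` are assumptions for the example, not part of BOINC.

```python
import re

MAX_LINE_LENGTH = 256
TAG = re.compile(r"<[^>]+>")


def _is_value_line(tags):
    """True for the <tag>value</tag> shape: an opening tag and its matching close."""
    return len(tags) == 2 and tags[1] == "</" + tags[0][1:]


def check_line_rules(message):
    """Return (line_number, problem) pairs for lines that break the rules."""
    problems = []
    for number, line in enumerate(message.split("\n"), start=1):
        if len(line) > MAX_LINE_LENGTH:
            problems.append((number, "line longer than 256 characters"))
        tags = TAG.findall(line)
        if len(tags) > 1 and not _is_value_line(tags):
            problems.append((number, "more than one element on a line"))
    return problems


# Valid XML, but everything is on one line.
one_line = "<a><b>1</b><c>2</c></a>"
# The same data with line feeds between containers and tags.
multi_line = "<a>\n<b>1</b>\n<c>2</c>\n</a>"
```

Running `check_line_rules(one_line)` reports a problem on line 1, while `check_line_rules(multi_line)` returns an empty list.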

Another important piece of information about the protocol is that both requests and responses are terminated with a control character:


“03”

When the managing application sends it to the BOINC daemon, it signifies that the request is complete and the daemon can begin processing it. Likewise, the managing application should wait for that control character before processing the response. For the remainder of this series of articles, assume that all requests and responses are terminated with the control character.
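Here is a minimal Python sketch of that framing. It assumes the control character shown above is the single byte with value 3 (written `\x03` in Python), and it uses Latin-1 as a stand-in single-byte encoding since Unicode is not supported. The helper names `frame_request` and `read_response` are made up for this example.

```python
import socket

# Assumption: the "03" control character is the single byte with value 3.
END_OF_MESSAGE = b"\x03"


def frame_request(xml_text):
    """Encode a request and append the terminating control character."""
    return xml_text.encode("latin-1") + END_OF_MESSAGE


def read_response(sock, bufsize=4096):
    """Read from sock until the terminator arrives; return the text before it."""
    received = bytearray()
    while END_OF_MESSAGE not in received:
        data = sock.recv(bufsize)
        if not data:
            raise ConnectionError("connection closed before the terminator arrived")
        received.extend(data)
    body, _, _ = bytes(received).partition(END_OF_MESSAGE)
    return body.decode("latin-1")
```

A caller would send `frame_request(...)` with `sock.sendall()` and then block in `read_response(sock)`. Anything that arrives after the terminator is discarded here, which is fine for a strict request-then-response exchange but would need buffering if messages were pipelined.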

Authentication

Authentication within BOINC is pretty basic. Like most other authentication systems it consists of a challenge and a response.

To initiate the authentication process send the following request:



5
5
15

The response will be the challenge to the request to authenticate:



5
5
13
1155697308.532692

The nonce value is used to salt the password hash. It is randomly generated for each request. MD5 is our primary hashing algorithm. If you had the following password:


password

The resulting string to be hashed would look like this:


1155697308.532692password

After MD5 hashes the combined nonce and password, the result looks like this:


679f1ff0d1c7ed56321c6bc857cdcb43
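The same computation can be sketched in Python with the standard `hashlib` module. The function name `nonce_hash` is made up for this example, and the sketch assumes the nonce and password are plain ASCII and that the digest is sent as lowercase hexadecimal, as in the value shown above.

```python
import hashlib


def nonce_hash(nonce, password):
    """Return the hex MD5 digest of the nonce followed directly by the password."""
    return hashlib.md5((nonce + password).encode("ascii")).hexdigest()


digest = nonce_hash("1155697308.532692", "password")
print(digest)  # 32 lowercase hexadecimal characters
```

Note that the nonce comes first and nothing separates it from the password.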

So now the last part of the authentication process is to tell the BOINC Daemon what the answer to its challenge is. We do this by sending the following request:



5
5
15

679f1ff0d1c7ed56321c6bc857cdcb43

If successful you’ll see a response like this:



5
5
13

A failure will look like this:



5
5
13

Some interesting notes on authentication:

  • All RPCs are protected if the request is coming from a managing application on a remote computer.
  • All RPCs that can change the behavior of the BOINC Daemon are protected even if running on the same machine.
  • RPCs that just query the state of the client are not protected when the managing application is running on the same machine.
  • Currently is the only exception to the above rules.
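The first three rules boil down to a two-input decision, sketched here in Python. The function and parameter names are hypothetical, and the exception mentioned in the last bullet is deliberately not modeled.

```python
def requires_authentication(is_remote, changes_daemon_behavior):
    """Apply the first three rules above; the single exception is not modeled."""
    if is_remote:
        # Every RPC from a remote managing application is protected.
        return True
    # On the same machine, only RPCs that change the daemon's behavior are protected.
    return changes_daemon_behavior
```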

More interesting tidbits will follow in future articles…

—– Rom

BOINC Client Communication Infrastructure


Image00112



BOINC, like a lot of modern software, is broken up into several components. Each component has a designated purpose.



BOINC Daemon is the heart of the client software package. It handles management requests from any software that understands the BOINC GUI RPC protocol, handles CPU scheduling for the various science applications, downloads new tasks from the projects it is attached to, and uploads the results of completed work back to the project servers.



BOINC Manager communicates with the BOINC daemon using the BOINC GUI RPC protocol to allow the participant to attach to another project or account manager, suspend and resume tasks, suspend and resume projects, suspend and resume the BOINC daemon itself, and check any messages the daemon has generated.



BOINC Screensaver handles the screensaver request from the operating system and notifies the daemon that it needs to choose a graphics-capable application to act as the screensaver. After the daemon and science application agree on who is supposed to display graphics, the science application opens its graphics window and the daemon updates its internal data. The next time the screensaver asks for an update, it finds out that the screensaver window is open and needs to be brought into the foreground.



All communication is handled by polling at regular intervals. Communication between the daemon, manager, and screensaver is through TCP/IP.



Communication between the daemon and the science applications is handled through shared memory.



Shared memory is a limited resource on most systems. On Solaris it is disabled by default, and on Linux I believe it is set up by default as 2MB across all processes. I believe different distros customize this setting when they build their Linux kernel. The daemon allocates 4KB per running science application.



Using shared memory as the communication mechanism between the daemon and manager would have meant that the manager could not connect remotely to a BOINC daemon on another machine.



Communication between the daemon and the project servers is handled by libCurl using the HTTP protocol. libCurl can handle the encryption layer of HTTP, SOCKS proxies, HTTP proxies, as well as multiple authentication types. Earlier versions of BOINC, before libCurl was being used, couldn’t handle all of those scenarios.




IP connections consist of an IP address and a port number for both the client and the server. So when BOINC Manager starts up it resolves a server name into an IP address using DNS. After the IP address has been resolved BOINC Manager tells the operating system to establish a connection to the target IP address using an operating system assigned source port and a well known target port.



If the manager and daemon are running on the same machine the connection could look like this:


127.0.0.1:10000 127.0.0.1:31416



10000 would be the outbound port, and 31416 is the inbound or listening port.
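The same thing can be observed from code. The Python sketch below opens a TCP connection and reports both endpoints; the helper name `open_connection` is made up for this example, and it is only a generic illustration, not what BOINC Manager actually runs.

```python
import socket


def open_connection(host, port, timeout=5.0):
    """Connect to host:port and return (socket, local_endpoint, remote_endpoint).

    The host name is resolved for us, and the operating system picks the
    outbound (source) port.
    """
    sock = socket.create_connection((host, port), timeout=timeout)
    return sock, sock.getsockname(), sock.getpeername()
```

With a daemon listening locally, `open_connection("127.0.0.1", 31416)` would return a remote endpoint of `("127.0.0.1", 31416)` and a local endpoint whose port number was assigned by the operating system, playing the role of 10000 in the example above.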



Connections from the daemon to a project server look the same as in the BOINC Manager case, except that the traffic goes through the Internet like this:





Image00212



IP addresses that begin with 192.168 (192.168.x.x) are private, non-routable addresses. Routers that support NAT use them so that many machines can share one public address, which stretches the available pool of IP addresses on the Internet.
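If you want to check whether an address falls in that range, Python's standard `ipaddress` module can do it. The helper name and the sample addresses below are arbitrary choices for illustration.

```python
import ipaddress

# The 192.168.x.x block written as a network.
PRIVATE_192_168 = ipaddress.ip_network("192.168.0.0/16")


def is_192_168_address(text):
    """True if the dotted-quad address lies inside 192.168.0.0/16."""
    return ipaddress.ip_address(text) in PRIVATE_192_168


inside = is_192_168_address("192.168.1.10")
outside = is_192_168_address("8.8.8.8")
```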



Tomorrow I’ll follow-up with some more detailed information about the BOINC GUI RPC protocol.



—– Rom

BOINC Simple GUI Refresh, Part II

Hot off the presses with the latest and greatest look and feel of the simple GUI:

Image001

We are getting closer to getting these improvements to alpha and beta testing. The guys at World Community Grid are doing a fantastic job.

Hovering over a project icon (each project can have its own icon) displays additional information:

Image002

Clicking on the project icon brings up a list of the project’s web site links that are currently displayed in the advanced GUI as buttons on the left hand side of the screen.

Image003

Some stuff still needs some more fine tuning, like the preferences dialog:

Image004

Ideally the “do work between” and “connect to Internet between” settings should have both a start and end time instead of just a start time. World Community Grid is in the process of tweaking their skin, which is why the preferences dialog looks so different from the main window.

I’ll try and address some concerns brought up by this thread on the S@H forums.

1. The Simple GUI is being written with the wxWidgets framework, although it looks like, at least for the first release, we’ll only have it enabled for Windows. There are some potential issues with fonts, globalization, and localization on Linux and Mac that we will need to cook on a bit longer.

2. While the simplified view of the client’s activities is getting a facelift, we are also going to add a few things to the advanced view:

  • Being able to hide the task pane.
  • Context menus
  • Property pages for both projects and tasks. Property pages should be able to display LTD/STD information. Stuff like that.

This is getting pretty exciting.

—– Rom

Latest BOINC happenings

It is time to tell us what you think. We are conducting a poll to determine where the hot spots are for what needs to happen with BOINC. We welcome all kinds of feedback. The more people that respond and the better coverage we get, the more we can improve BOINC and help the projects improve their overall experience.

You can find the poll here:
http://boinc.berkeley.edu/poll.php

The results are published here:
http://boinc.berkeley.edu/poll_results.php

I turned on my TV this weekend to catch up on some of my recordings and I found this in my wait recorded queue:
Rosetta Presentation

I have my media center set up to record any Computer Science Colloquium from the University of Washington that comes on UWTV. This one happens to be David Baker of R@H giving a presentation to the computer science students about how Rosetta works and how they use the results. He even gave BOINC a plug and discussed how R@H was changing how they do things.

We have had some nice press within the last week, here are some of the articles:
Use your computer idle time for a great cause
Putting your computer to work to fight against malaria in Africa
Coming down to Earth

BOINC Application Optimization: The Good, the Bad, and the Ugly, Part II

Here is another little misconception:
CPU Capability detection coming to a BOINC client near you soon

BOINC does not use the processor’s CPUID instruction to determine what instruction sets are supported.

Using CPUID is a good idea if you are an OS, but it is a bad idea if you are an application. There are two parts to the supported instruction set problem, one is the CPU and the other is the OS. If your OS doesn’t support the desired instruction set you are just asking for trouble.

Here is an example of what I mean. Let us take Windows 95 Gold without any patches, and a modern single core processor.

In the Windows 95 era, MMX and 3DNow were just taking off and SSE wasn’t mainstream yet. MMX reused the standard 80-bit x87 floating point registers, so the OS really didn’t have to do anything new to support it. A thread context therefore only had to cover the general purpose registers, floating point stack registers, and debug registers, and everything for the thread stayed consistent when the OS switched execution to another thread.

Now enter modern processors. With the introduction of SSE, new registers were added to the CPU: XMM0 through XMM7. If the OS doesn’t know about those registers, it cannot save them before switching to another application. In the worst case scenario you could have data from your favorite DVD interfering with a BOINC science application, since they’ll both be overwriting one another’s register values. Picture E@H processing TV signals and your DVD player displaying E@H data as video artifacts.

It appears Intel and AMD created something known as enhanced protected mode, which exposes the additional SSE registers. If the OS doesn’t initialize itself as enhanced protected mode aware, the registers stay hidden from both applications and the OS. So if you attempt to run an SSE application on a CPU/OS combination that doesn’t support it, you should get an illegal instruction error or privileged instruction error.

Apparently AMD decided to increase the number of registers available on the AMD64 line from 8 to 16, but I don’t currently know if this is only for 64-bit OSes or for both 32-bit and 64-bit OSes. If there isn’t a special safeguard the OS has to follow, things could end up like the DVD player scenario I mentioned above.

So BOINC queries the OS for which instruction sets are supported. If the OS can detect an instruction set, it should support it.

On Windows, the function is called IsProcessorFeaturePresent().

On Linux, we currently read /proc/cpuinfo.

On the Mac we’ll probably be using sysctl().
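To give a feel for the Linux case, here is a small Python sketch that pulls the feature flags out of `/proc/cpuinfo` text. It is not the BOINC implementation; the function name `cpu_flags` and the sample text are made up for the example, and it assumes the x86 layout where the features appear on a line starting with `flags`.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of feature flags from the first 'flags' line, if any."""
    for line in cpuinfo_text.splitlines():
        key, separator, value = line.partition(":")
        if separator and key.strip() == "flags":
            return set(value.split())
    return set()


# Hypothetical, heavily shortened excerpt of /proc/cpuinfo on an x86 machine.
SAMPLE_CPUINFO = "processor\t: 0\nflags\t\t: fpu mmx sse sse2\n"

flags = cpu_flags(SAMPLE_CPUINFO)
has_sse = "sse" in flags
has_sse3 = "sse3" in flags
```

On a real system you would pass in the contents of the file, for example `cpu_flags(open("/proc/cpuinfo").read())`.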

—– Rom

References:
http://en.wikipedia.org/wiki/IA-32