
SCA Documentation Project

by rpwagner last modified 2006-08-21 08:49

List of documentation we need for the SCA website. Basically a todo list for Jake.

On Your Own

Derived Data Formats

Both for documenting the archive, and to help with later stuff, we'll need an idea of what's in the derived files. We'll need it for the simulations, also, but for now let's do this. Here are the types of derived data that will need to be documented:

  • HALO, Cluster Positions (ParaView)
  • PART, Particles (ParaView)
  • PROJ, Projections (SciPy, PNGWriter)
  • XRAY, X-ray Projections (SciPy, PNGWriter)
  • PROF, Radial Profiles (gnuplot)
  • VOL, Extraction (Vista)
As you know, these are folders that are stored under the simulation data. To start off, grab a few of each type from the different catalogs, and put them in a folder to play with. A lot of this is just going to be poking around the files, and writing up what you see. Of course, an email to some of the people who helped put this thing together may save you a ton of work.

For the datasets (i.e., the files in HPSS and the SRB), start with the general stuff:

Typical Contents
Most of the datasets are tar files, and just what the hell is in them? Bust open a few of each type of derived dataset and take a look.
Brief (1 sentence) description
Example: "2D arrays of scalar fields summed along an axis."
Longer Description
1 to 2 paragraphs summarizing the format and how the data was created.
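
For the "Typical Contents" step, here is a rough sketch of how you might peek inside a tar file without unpacking it. It only uses Python's standard tarfile module. The function name summarize_tar is made up for this example, and nothing here assumes anything about what the SCA tar files actually contain.

```python
import tarfile
from collections import Counter
from pathlib import Path


def summarize_tar(path):
    """Hypothetical helper: list the regular files in one tar file.

    Returns (member names, count of file extensions).
    """
    names = []
    extensions = Counter()
    # Mode "r:*" lets tarfile detect gzip/bzip2 compression by itself.
    with tarfile.open(path, "r:*") as archive:
        for member in archive.getmembers():
            if member.isfile():
                names.append(member.name)
                # Files without an extension are counted under "(none)".
                extensions[Path(member.name).suffix or "(none)"] += 1
    return names, extensions
```

Calling summarize_tar on a handful of files of each derived type, and comparing the member names and extension counts, should give a first guess at what "typical" means for that type.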

For each of the files we can expect in the dataset, we need the following:

File Formats
This is going to be something of a pain. The files are most likely going to come in four flavors: text, binary, HDF4, or HDF5. The first thing you need to write is a script to guess that. DataStar and the TeraGrid have HDF4 and HDF5 utilities, so you can use the process of elimination. Try to use h5ls, then h4ls, then more, etc., until something doesn't die.
File Structure
Once you figure out the format, open up the files and describe what you see. For text files, this may be a description of the columns, for HDF files it will be the datasets and groups in the files.
Units
You know, like CGS, code units, comoving, or Intrilligators. Another pain, but an important one.
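
For the format-guessing script mentioned under "File Formats", here is one possible sketch. Note that it does not shell out to h5ls and friends by process of elimination; it swaps in a different technique and looks at the first bytes of the file for the published HDF5 and HDF4 signatures, then falls back to a crude text-or-binary check. The name guess_format and the 4096-byte sample size are assumptions made for this example, so treat the result as a first guess and confirm it with the real HDF utilities.

```python
import codecs
import tarfile

HDF5_MAGIC = b"\x89HDF\r\n\x1a\n"  # HDF5 signature at the start of the file
HDF4_MAGIC = b"\x0e\x03\x13\x01"   # HDF4 magic number


def guess_format(path, sample_size=4096):
    """Hypothetical helper: guess HDF5, HDF4, tar, text, binary, or empty."""
    with open(path, "rb") as handle:
        head = handle.read(sample_size)
    if not head:
        return "empty"
    # Only offset 0 is checked; an HDF5 file with a user block
    # (signature at byte 512, 1024, ...) would be missed here.
    if head.startswith(HDF5_MAGIC):
        return "HDF5"
    if head.startswith(HDF4_MAGIC):
        return "HDF4"
    if tarfile.is_tarfile(path):
        return "tar"
    # Heuristic: text has no NUL bytes and decodes as UTF-8.
    if b"\x00" in head:
        return "binary"
    try:
        # An incremental decoder tolerates a multi-byte character
        # that was cut off at the end of the sample.
        codecs.getincrementaldecoder("utf-8")().decode(head, final=False)
    except UnicodeDecodeError:
        return "binary"
    return "text"
```

Run it over the folder of sample files and tally the answers per derived type; anything reported as "binary" still needs a person to work out what wrote it.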

Make your life easy--create an example page template in Plone, and reuse it. List the typical contents, and then describe each expected file.

Analysis (Mini) Tutorials

These are going to go on the Plone site, probably under codes, or maybe a new "Cookbook" folder. The idea is pretty simple; for each of the derived data formats, we want a document showing how to get information out of a derived dataset. It doesn't matter what program we use (so long as it's Free software), or how fancy the result is, or if the tutorials are in any way similar. We'll try to use only one or two programs (like ParaView, gnuplot or SciPy), and it would be nice if the tutorials were steps in a bigger picture, but these things aren't necessary.

This will be the start of our tutorials--these will be very useful beyond just this archive. Right now our documentation focuses mostly on how to get Enzo to spit out data, and not on what to do with the results. This will help to change that.

Each tutorial should contain:

Links to Software
Provide links to all of the software used, whether it's something we have or wrote, or an outside package.
Available Code/Scripts
If you or I write something to automate part of the process, it needs to be available somewhere.
Step-by-Step Instructions
These things are called cookbooks for a reason. Look at SciPy's Cookbook for some ideas.
Graphical Result
Everybody digs plots and pictures. This will help you decide what to do with the data. Figure out something to plot, and make that the goal.
I realize you may not have any idea how to go from the data to a plot, so I'll help with that. Unfortunately, I don't have much of an idea of what to plot, so you'll probably have to go to Mike for ideas on that.

Same as above, create an example page template in Plone. That SciPy page is a good place to get ideas.

Get Mike's Help

Supporting Publications

The first thing you need from Mike is a list of publications related to the archive. This includes anything that describes it, and any known publications using the contents. When you ask for his help with the rest of the stuff, he's going to point you towards these things anyways, so ask him for them up front. You can find a few of them yourself by googling "simulated cluster archive"; for example, here's a poster.

These are going to be a big help to you, since anything you find in them, you won't have to rewrite.

Archive Description

General content for the archive:

Synopsis
A paragraph for the front page of the website. Probably a simplified and modified version of an abstract from one of the publications.
Full description
The Who, What, When, Where, Why and How of the project. This is going to go under the "About" page. The size of this will depend on how many links to external documents we can use. If we can link to some publications, we can trim it down. At a minimum, it should provide users with an understanding of what they're looking at, and how it was created.

Catalog Descriptions

For each catalog, we need the following:

Title
Just what you think. Take COOL, LCDM, etc., and turn them into Cool Beer and Lambda Cold Dark Matter, respectively.
Brief (1 sentence) description
This will probably be the main sentence of the full description, like: "Catalog of galaxy clusters modelled using a heuristic star creation algorithm." But don't quote me.
Full description
1 to 2 paragraphs (or more) describing the catalog. This is where to put information about the models used for each catalog, and how they differed.

Verbiage

Surf the web for other sites serving up research data, like the NVO, ADIL, or anything that says something like "funding for this project was made possible by NASA grant FOOBAR". Just like in sports, it is very important to acknowledge your sponsors. Find some examples, particularly of sites displaying results from NASA funded projects.

Once you have a few examples, Mike should be able to give you the info to make a blurb for the website. The blurb should only be a sentence, and will almost definitely look like:

Funding for this project was provided by NASA grant FOO.
That sentence will be used on the bottom of every page, and a modified version should appear in the description of the archive.

