SCA Documentation Project
List of documentation we need for the SCA website. Basically a todo list for Jake.
On Your Own
Derived Data Formats
Both for documenting the archive and to help with later stuff, we'll need an idea of what's in the derived files. We'll need it for the simulations too, but for now let's do this. Here are the types of derived data that will need to be documented:
- HALO, Cluster Positions (ParaView)
- PART, Particles (ParaView)
- PROJ, Projections (SciPy, PNGWriter)
- XRAY, X-ray Projections (SciPy, PNGWriter)
- PROF, Radial Profiles (gnuplot)
- VOL, Extraction (Vista)
For the datasets (i.e., the files in HPSS and the SRB), start with the general stuff:
- Typical Contents
- Most of the datasets are tarfiles, and just what the hell is in them? Bust open a few of each type of derived dataset and take a look.
- Brief (1 sentence) description
- Example: "2D arrays of scalar fields summed along an axis."
- Longer Description
- 1 to 2 paragraphs summarizing the format and how the data was created.
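To get a first look at the "Typical Contents" without unpacking anything, something like the following might help. It is only a sketch using Python's standard `tarfile` module; the function name `summarize_tar` and the idea of grouping members by file extension are assumptions made for illustration, not anything the archive prescribes.

```python
import tarfile
from collections import Counter
from pathlib import PurePosixPath


def summarize_tar(path):
    """Return (file_count, total_bytes, extension_counts) for one tarfile.

    Only regular files are counted; directories and links are skipped.
    """
    extensions = Counter()
    file_count = 0
    total_bytes = 0
    # "r:*" lets tarfile detect gzip/bzip2/xz compression on its own.
    with tarfile.open(path, "r:*") as tar:
        for member in tar:
            if not member.isfile():
                continue
            file_count += 1
            total_bytes += member.size
            # Files with no extension are grouped under "(none)".
            suffix = PurePosixPath(member.name).suffix or "(none)"
            extensions[suffix] += 1
    return file_count, total_bytes, extensions
```

Running this over a few tarfiles of each derived type should show quickly whether the contents are consistent enough to describe once per type.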
For each of the files we can expect in the dataset, we need the following:
- File Formats
- This is going to be something of a pain. The files are most likely going to come in four flavors: text, binary, HDF4, or HDF5. The first thing you need to write is a script to guess which one you have. DataStar and the TeraGrid have HDF4 and HDF5 utilities, so you can use the process of elimination. Try to use h5ls, then h4ls, then more, etc., until something doesn't die.
- File Structure
- Once you figure out the format, open up the files and describe what you see. For text files, this may be a description of the columns; for HDF files, it will be the datasets and groups in the files.
- Units
- You know, like CGS, code units, comoving, or Intrilligators. Another pain, but an important one.
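As an alternative to running the command-line utilities one after another, the format guess under "File Formats" can be sketched by checking the first bytes of each file. This is a minimal sketch, not a definitive implementation: the function name `guess_format` and the 4096-byte sample size are assumptions, and the text-versus-binary test (no NUL bytes, decodes as ASCII) is just a heuristic. The signatures are the ones at the start of HDF5 and HDF4 files; an HDF5 file with a user block has its signature at a later offset and would be missed here.

```python
# File signatures found at the very start of HDF5 and HDF4 files.
HDF5_MAGIC = b"\x89HDF\r\n\x1a\n"
HDF4_MAGIC = b"\x0e\x03\x13\x01"


def guess_format(path, sample_size=4096):
    """Guess whether a file is 'HDF5', 'HDF4', 'text', or 'binary'."""
    with open(path, "rb") as handle:
        head = handle.read(sample_size)
    if head.startswith(HDF5_MAGIC):
        return "HDF5"
    if head.startswith(HDF4_MAGIC):
        return "HDF4"
    # Heuristic: NUL bytes or non-ASCII bytes suggest a raw binary file.
    if b"\x00" in head:
        return "binary"
    try:
        head.decode("ascii")
    except UnicodeDecodeError:
        return "binary"
    # Note: an empty file also ends up here and is reported as text.
    return "text"
```

Anything reported as HDF4 or HDF5 can then be handed to the matching utility to list its contents, and anything reported as text can be opened directly.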
Make your life easy--create an example page template in Plone, and reuse it. List the typical contents, and then describe each expected file.
Analysis (Mini) Tutorials
These are going to go on the Plone site, probably under codes, or maybe a new "Cookbook" folder. The idea is pretty simple; for each of the derived data formats, we want a document showing how to get information out of a derived dataset. It doesn't matter what program we use (so long as it's Free software), or how fancy the result is, or if the tutorials are in any way similar. We'll try to use only one or two programs (like ParaView, gnuplot or SciPy), and it would be nice if the tutorials were steps in a bigger picture, but these things aren't necessary.
This will be the start of our tutorials--these will be very useful beyond just this archive. Right now our documentation focuses mostly on how to get Enzo to spit out data, and not on what to do with the results. This will help to change that.
Each tutorial should contain:
- Links to Software
- Provide links to all of the software used, whether it's something we have or wrote, or an outside package.
- Available Code/Scripts
- If you or I write something to automate part of the process, it needs to be available somewhere.
- Step-by-Step Instructions
- These things are called cookbooks for a reason. Look at SciPy's Cookbook for some ideas.
- Graphical Result
- Everybody digs plots and pictures. This will help you decide what to do with the data. Figure out something to plot, and make that the goal.
Same as above, create an example page template in Plone. That SciPy page is a good place to get ideas.
Get Mike's Help
Supporting Publications
The first thing you need from Mike is a list of publications related to the archive. This includes anything that describes it, and any known publications using the contents. When you ask for his help with the rest of the stuff, he's going to point you towards these things anyways, so ask him for them up front. You can find a few of them yourself by googling "simulated cluster archive"; for example, here's a poster.
These are going to be a big help to you, since anything you find in them, you won't have to rewrite.
Archive Description
General content for the archive:
- Synopsis
- A paragraph for the front page of the website. Probably a simplified and modified version of an abstract from one of the publications.
- Full description
- The Who, What, When, Where, Why and How of the project. This is going to go under the "About" page. The size of this will depend on how many links to external documents we can use. If we can link to some publications, we can trim it down. At a minimum, it should provide users with an understanding of what they're looking at, and how it was created.
Catalog Descriptions
For each catalog, we need the following:
- Title
- Just what you think. Take COOL, LCDM, etc., and turn them into Cool Beer and Lambda Cold Dark Matter, respectively.
- Brief (1 sentence) description
- This will probably be the main sentence of the full description, like: "Catalog of galaxy clusters modelled using a heuristic star creation algorithm." But don't quote me.
- Full description
- 1 to 2 paragraphs (or more) describing the catalog. This is where to put information about the models used for each catalog, and how they differed.
Verbiage
Surf the web for other sites serving up research data, like the NVO, ADIL, or anything that says something like "funding for this project was made possible by NASA grant FOOBAR". Just like in sports, it is very important to acknowledge your sponsors. Find some examples, particularly of sites displaying results from NASA funded projects.
Once you have a few examples, Mike should be able to give you the info to make a blurb for the website. The blurb should only be a sentence, and will almost definitely look like:
Funding for this project was provided by NASA grant FOO.
That sentence will be used on the bottom of every page, and a modified version should appear in the description of the archive.