Quick EAD changes with Notepad++ and Scraper

Just a few tricks I picked up while editing a large-ish (56,095-line) EAD file in Notepad++. I wanted to pass them along in case they’re helpful to anyone.

The Why

Personal name notes were added to the file level in a finding aid. Near the end of the project, it was determined that these would do better being somewhere else (for reasons we won’t get into here).

The What

Removing these names through Archivists’ Toolkit would be a tedious job for our processors. It involves opening each file-level record, switching to the ‘Names & Subjects’ tab, deleting the name, and saving the record. And, of course, AT would probably freeze a few times during the process, making sure everyone’s good and miserable.

I wanted to see if I could help.

The How

Removing text within tags

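After exporting the EAD from AT, I open it in Notepad++. Following the helpful answers from this Stack Overflow question, I use the regular expression (with the Search Mode in the Replace dialog set to “Regular expression”)

<ns2:persname source="ingest">.*?</ns2:persname>

to find the personal names and replace them with nothing, and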

<ns2:corpname source="ingest">.*?</ns2:corpname>

to find all corporate names, replacing them again with nothing.

Because I only want to do that for the lower levels of the finding aid, I select only the content within the <dsc></dsc> tags. Find the opening <dsc> and start the selection (Edit -> Begin/End Select). Find the closing </dsc> and end the selection (Edit -> Begin/End Select again). With the “In Selection” option checked in the Replace dialog, the above steps leave the collection-level access control names where they are.
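For anyone who would rather script this than click through the Replace dialog, here is a rough JavaScript sketch of the same idea. It is not the Notepad++ workflow itself: the function name stripIngestNames is made up for illustration, and it assumes the export holds one dsc element, with or without a namespace prefix on the tag.

```javascript
// Hypothetical helper: removes ingest-sourced personal and corporate names,
// but only inside the <dsc> element, much like a replace with "In Selection".
function stripIngestNames(ead) {
  // Match from the opening dsc tag (optional prefix such as ns2:) through
  // the LAST matching closing tag; the greedy [\s\S]* is deliberate.
  const dsc = /<((?:\w+:)?dsc)\b[^>]*>[\s\S]*<\/\1>/;
  // Same non-greedy idea as the patterns above; [\s\S] also spans newlines.
  const names = /<ns2:(persname|corpname) source="ingest">[\s\S]*?<\/ns2:\1>/g;
  // Only the matched dsc block is rewritten; everything else is untouched.
  return ead.replace(dsc, (block) => block.replace(names, ''));
}
```

Calling stripIngestNames on the full text of the export returns a new string in which collection-level names outside the dsc element are left alone. If the file has no dsc element at all, the text comes back unchanged.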

Screen Scraping

I don’t, however, want to just throw away all the names (even though I’m not sure I want them all at the collection level). To save them, I step away from Notepad++ and go back to my browser.

Following this tutorial from a few years ago, I used the Scraper Chrome extension to select an added name. I modified the XPath as needed (see screenshot), and exported the full list of names (1,963) to a Google Spreadsheet.

Using the Scraper extension for Chrome. To scrape a list, choose an example, and modify the XPath as needed to ensure all are chosen.
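A different route to the same list, if a browser extension is not an option, is to pull the names straight out of the EAD text before deleting them. The sketch below swaps the XPath scrape for a regular expression over the export; collectIngestNames is a hypothetical name, and the pattern assumes the same ns2 tags and source="ingest" attribute shown earlier.

```javascript
// Hypothetical helper: lists every ingest-sourced name found in the text.
function collectIngestNames(ead) {
  const pattern = /<ns2:(persname|corpname) source="ingest">([\s\S]*?)<\/ns2:\1>/g;
  const names = [];
  for (const match of ead.matchAll(pattern)) {
    // match[1] is the tag name, match[2] is the text between the tags.
    names.push({ type: match[1], name: match[2].trim() });
  }
  return names;
}
```

Joining the results with something like collectIngestNames(text).map((n) => n.name).join('\n') gives one name per line, which pastes cleanly into a spreadsheet column.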


And Done

I then delete the collection from Archivists’ Toolkit and import the modified EAD.

This 10-15 minute project saved a lot of work and hassle for the processors. We’re looking into ArchivesSpace, and are likely to start using it in production in the near(ish) future. Until then (at least) it’s often handy to have non-Archivists’ Toolkit ways of manipulating EAD data quickly.


Quick Note: TimelineJS and Drupal Secure Pages

Some time ago, we launched a gallery to highlight some digital content. We were excited to use Timeline JS. After weeks of developing our timeline on our sandbox Drupal instance and writing content, we were surprised and disappointed to see that the timeline didn’t work in production. I found the solution online (link) and wanted to post about it here as well, in case that makes it easier for others to find in the future.

When we moved to our production site our timeline wouldn’t load. Very frustrating. Much grumbling. Even some cold sweat as the deadline loomed.

Our live Drupal site is using Secure Pages. As such our pages are HTTPS. Timeline JS can’t load on secure pages with the default embed URL. Luckily, you can replace the default embed URL “” with “”

A very special thanks to the folks who posted this answer: And of course, a big thank you to the Timeline JS project for the great tool.

See Also:
Our use of Timeline JS:


Finding Aid Tutorial with jQuery Impromptu

Another step along the never-ending quest to make things easier for our patrons:

Recently, the APS went live with a tutorial that I’ve been working on, off and on, for a little while. Using jQuery Impromptu, I wanted to create a ‘product tour’-like introduction to how to request manuscripts. We use a number of different systems (XTF for searching and serving the finding aids, Aeon for requesting), and there’s no reason to think any of this is intuitive to anyone.

This is how to tell us what you want. A screen shot for the tutorial.

Originally I wanted to integrate the tour directly in the products themselves, but this is challenging because (1) as the user follows the steps they would be taken to different pages, and (2) many of our pages are fluid or responsive, and I found that I had a much easier time of all this using a fixed-width HTML page. For these reasons, I decided to make a stand-alone HTML page with images of what the user will be seeing. This creates an awkward finish (I just leave the user with links), but solves many of the other problems.

The HTML is pretty simple: just a bunch of images in an article, inside the body. A sample:

<article>
	<h1 id="artcle-header">Finding Aid Tutorial</h1>
	<div id="coll-search">
		<img src="...path-to.gif">
	</div>
	<div id="coll-search-return">
		<img src="...path-to.gif">
	</div>
	<div id="ToC-One">
		<img src="...path-to.gif">
	</div>
	<div id="request">
		<figure>
			<img src="...path-to.gif">
		</figure>
	</div>
	<div id="aeon-login">
		<img src="...path-to.gif">
	</div>
</article>


jQuery Impromptu uses ‘states’. Here’s what one of mine looks like. (Note the x and y coordinates. I really wanted my arrow to point to specific things on the images, and spent some time trying to get that just right. Again, hence the fixed-width page.)

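/*State 10 Navigation Bold*/
{
	title: 'Orientation',
	html: "As you click on an item in the table of contents, it will remain bold. As some finding aids can be quite large, this will help you stay oriented.",
	buttons: { Back: -1, Next: 1 },
	focus: 1,
	position: {
		container: '#ToC-One',
		x: 150,
		y: 7,
		width: 450,
		arrow: 'lb'
	},
	// Assumed handler: Back and Next step through the tour states.
	submit: function(e, v, m, f){
		if(v == -1){
			$.prompt.prevState();
			return false;
		}
		if(v == 1){
			$.prompt.nextState();
			return false;
		}
	}
}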


jQuery Impromptu is not the only option for product tours (Bootstrap Tour is another), but if you do go with Impromptu, do yourself a favor and buy the e-book. It’s very helpful, and Trent Richardson, the creator, was nice enough to answer a silly question or two from me.


Thought I would pass this along in case anyone finds it helpful.

Finding aid tutorial:

PHP Variables for Menu Items

Nothing groundbreaking here for the (even semi-) experienced programmers out there, but this has been a big help for me, so I thought I’d pass it along.

It’s helpful to have the menu item for the current page to be different in some way — different color, usually. This is commonly built into frameworks and CMS solutions. But when building out the code for a recent web exhibition, this gave me pause.

I got it to work by using PHP to assign CSS classes.

Starting with the HTML of my menu:

<ul class="nav">
	<li><a href="index.php">Home</a></li>
	<li><a href="case-one.php">Case 1</a></li>
	<li><a href="case-two.php">Case 2</a></li>
	<li><a href="case-three.php">Case 3</a></li>
	<li><a href="case-four.php">Case 4</a></li>
	<li><a href="full-text.php">Full Text</a></li>
	<li><a href="credits.php">Credits</a></li>
</ul>

And knowing that I want a class for the active page (I love simplicity, so I’ll call that class “active”), I start by adding an empty class attribute to each list item.

<ul class="nav">
	<li class=""><a href="index.php">Home</a></li>
	<li class=""><a href="case-one.php">Case 1</a></li>
	<li class=""><a href="case-two.php">Case 2</a></li>
	<li class=""><a href="case-three.php">Case 3</a></li>
	<li class=""><a href="case-four.php">Case 4</a></li>
	<li class=""><a href="full-text.php">Full Text</a></li>
	<li class=""><a href="credits.php">Credits</a></li>
</ul>

Then I use the following PHP within the class quotes:

"<?php if ($section == "-") { echo "active"; } ?>"

Within the quotes of the section comparison, I add a value that’s meaningful for each page: “home”, “case1”, “credits”, etc. For example:

"<?php if ($section == "home") { echo "active"; } ?>"

So the full nav markup looks like this:

<ul class="nav">
	<li class="<?php if ($section == "home") { echo "active"; } ?>"><a href="index.php">Home</a></li>
	<li class="<?php if ($section == "case1") { echo "active"; } ?>"><a href="case-one.php">Case 1</a></li>
	<li class="<?php if ($section == "case2") { echo "active"; } ?>"><a href="case-two.php">Case 2</a></li>
	<li class="<?php if ($section == "case3") { echo "active"; } ?>"><a href="case-three.php">Case 3</a></li>
	<li class="<?php if ($section == "case4") { echo "active"; } ?>"><a href="case-four.php">Case 4</a></li>
	<li class="<?php if ($section == "full") { echo "active"; } ?>"><a href="full-text.php">Full Text</a></li>
	<li class="<?php if ($section == "credits") { echo "active"; } ?>"><a href="credits.php">Credits</a></li>
</ul>

Lastly: to make it all work, add the variable to the individual pages. For the home page, for example:

<?php $section = "home"; ?>

When that page loads the conditional

if ($section == "home")

will be true and that menu item will have the CSS class “active”.
To see how it works out in the example I’ve been using, feel free to check out the exhibition over here.

“No Taxonomy Without Representation,” A New Web Exhibition

I just finished putting together a web exhibition, “No Taxonomy Without Representation,” and I think it turned out pretty well.

It was a good excuse to use some new jQuery plugins, including Tooltipster, and a good excuse to play with Bootstrap.

Introducing Libipsum: The Library/Archives-Related Ipsum Generator

I’ve been working with a lot of lorem ipsum lately for library web exhibit mock-ups and the like. There are a lot of fun ipsum generators out there (see, for example, Mashable’s list). For the last month or so I have been idly wondering if I’d be able to make one relevant to the library/archives world.

Good news!

I came upon Justin Kestler’s very helpful blog post and source files, modified the terminology and a few other things, and am now proud to present Libipsum, the ipsum generator for the library/archives world.

It’s built with JavaScript and jQuery. It’s lightweight and all-around pretty neat.

Justin, if you’re reading this, thank you! This was a lot of fun to work on.

Jargon in Finding Aids

Sometimes I’m more bothered by this than at other times, but with a recent Finding Aid re-work I thought I’d try out a few ways of breaking through archives-related jargon.

The archives profession is not as bad as some others, luckily, but we do adhere to some terminology that’s not immediately understandable. Scope & Content, anyone?

It might be helpful to have a way to offer a one-sentence explanation to help folks who aren’t used to thinking about finding aids. After all, we do want people to find our stuff, and part of that is finding our finding aids useful.

To that end, I’ve started playing with a solution using jQuery and jQuery UI.

It’s always easier to show than to tell. Here are some screenshots:

Table of Contents, including jargon
Hovering over the Table of Contents entry shows a “what’s this?” link.
Hovering over “What’s this?” shows the explanation.

In short, within a table of contents (or any other place, really), a “what’s this?” link appears when an item is hovered over. When the “what’s this?” link is hovered over, a tooltip appears with a short explanation.

And you can see a live demo here:

It’s still a work in progress, but it might turn into something helpful.

F U, Ctrl+F

Detailed finding aids can get pretty complicated. Serving these finding aids to patrons can be challenging.

A common approach is the use of Ctrl+F to find items of interest. We all know how this works. A patron is looking for a specific name or date within a large finding aid. We suggest that they hold down those two magic keys and type in what they’re looking for.

The problem, though, is that this tends to demolish all that context we’ve been working so hard to establish.

Take, for example, the following common scenario from my shop:

Patron is looking for a name (Van Bibber, say) in a specific collection:
“Excellent!” we say. “Simply use Ctrl+F in the finding aid to find the name!” The problem? We can find Van Bibber, but we’ve lost the context of the item. Was that in Series 1? Series 2? For that matter, what were those series?

Searching with Ctrl+F often strips all context from the item

To help mitigate this, we’ve been exploring various alternatives to the layout of our finding aids.

Instead of putting the Table of Contents (ToC) at the top, we’ve moved it to the left. Using a simple grid design, we’ve given both the ToC and the Content sections a fixed position and a percentage width, so that the overall layout of the finding aid remains visible regardless of where one is in the document.

For a long time we've had the Table of Contents at the top of the finding aid. As such the ToC -- and all the context it supplies -- is lost as soon as a patron scrolls past it.
Our new design keeps the ToC on the left, ensuring that the overall layout of the document remains visible.

To help orient the user even more, we’ve used jQuery to add a new class to each link as it’s clicked. We’ve styled the class to make that link bold. So when one clicks on “Series 1”, say, that link stays bold until another one is clicked. Pretty handy.
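The bold-on-click behavior is small enough to sketch without jQuery. The version below uses the plain DOM classList API so it has no dependencies; the class name toc-active and the #toc selector are placeholders for illustration, not necessarily the names used on the live site.

```javascript
// Give the clicked link the class and take it off every other link.
function markClicked(links, clicked, className = 'toc-active') {
  for (const link of links) {
    link.classList.toggle(className, link === clicked);
  }
}

// Browser wiring (skipped when there is no DOM, for example under Node).
if (typeof document !== 'undefined') {
  const links = Array.from(document.querySelectorAll('#toc a'));
  for (const link of links) {
    link.addEventListener('click', () => markClicked(links, link));
  }
}
```

A CSS rule such as .toc-active { font-weight: bold; } then does the visual work, so the script only has to track which link was clicked last.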

We’ve also been noticing a lot of iPads making their way into the reading room. We’re supplementing our base CSS with media queries to be sure that our new design scales well on screens of various sizes. At a max-width of 724px and below, we’ve reverted to the ToC-at-top look (just in case anyone wants to browse our finding aids on their phones).

More work to do

All that’s pretty great. And it’s solving a lot of problems. But we still want to do more. Specifically, I’d love to get something akin to Bootstrap’s ScrollSpy to work. That would enable us to style the link in the ToC based on where a patron was viewing regardless of a click action. As of the time of this writing, I’m still hammering away at this. Please: be sure to let me know if you’ve done this in your shop!
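One way the ScrollSpy idea could be approached, as a rough sketch and not Bootstrap’s own implementation: work out which section the reader has scrolled to, then mark the matching ToC link. The selectors (.content section[id], #toc a), the class name toc-active, and the 10-pixel offset are all assumptions for illustration, and the wiring assumes the window itself is what scrolls.

```javascript
// sections: array of { id, top }, sorted by top (pixels from the page top).
// Returns the id of the last section whose top is at or above the scroll
// position plus offset, or null when the reader is above the first section.
function currentSectionId(sections, scrollTop, offset = 0) {
  let current = null;
  for (const section of sections) {
    if (section.top <= scrollTop + offset) {
      current = section.id;
    } else {
      break;
    }
  }
  return current;
}

// Browser wiring (skipped when there is no DOM, for example under Node).
if (typeof window !== 'undefined' && typeof document !== 'undefined') {
  window.addEventListener('scroll', () => {
    const sections = Array.from(document.querySelectorAll('.content section[id]'))
      .map((el) => ({ id: el.id, top: el.getBoundingClientRect().top + window.scrollY }));
    const id = currentSectionId(sections, window.scrollY, 10);
    document.querySelectorAll('#toc a').forEach((link) => {
      link.classList.toggle('toc-active', link.getAttribute('href') === '#' + id);
    });
  });
}
```

Keeping the lookup in its own function means it can be checked with plain numbers before any of the scroll handling is involved. Recomputing the section positions on every scroll event is the simplest thing that could work; a larger finding aid might want those positions cached.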

See more examples

It might be something in the air the last couple of years, but it feels like plenty of repositories are working on finding aid revisions. I’ve found both Princeton University’s new(ish) Finding Aids Site (see also their Code4Lib article) and the Rockefeller Center’s new Dime site particularly inspiring and impressive. You guys rock!

Revamping this site

I know, I know, everyone hates a “there’ll be content here soon, I promise” post, but here it is anyway.

I’m in the process of revamping this site from a series of static pages that I’ve been using as a parking lot for project descriptions to an active (hopefully) blog on the adventures of librarian-based web development.

I’ve got some pretty neat projects in the works, and hope to share them soon.