Empirical Research

Forget My Name

I am incredibly fortunate that a group of Japanese researchers has already done much of the hard work of figuring out how to turn the hundreds of gigabytes of SGML documents I’m working with into a nice handy database, and moreover has given me the code they used to do it. Instead of figuring out how to do what they did on my own, I simply have to figure out what they did, and then decide what I’d like to do differently. As with everything else this summer, this task highlights important cultural differences.

Today I’ve been going through the specification for the raw government data I’ve been given and comparing it to the code given to me by the Japanese researchers whose work I’m building on, to see what they included in their dataset and what they left out. The raw government data includes a significant amount of low-sensitivity personally identifiable information. This is mainly name (and sometimes address) information about individuals and firms who have applied for trademark registrations, about the attorneys who represent them, and about the examiners–government employees all–who consider their applications.

Similar information appears in the US government’s data on trademark applications. The US government released all this data to the public several years ago, and continues to update it on a regular basis, which means that the names and addresses of applicants and their attorneys, and the names of examining attorneys and their supervisors, are all part of the public record–freely available to anyone with the interest and wherewithal to find them.

I know a few people who were pretty shocked to learn that you could search the USPTO’s free, public trademarks dataset by examiner name, and find any examiner’s entire work history–how much of a pushover they are, how quickly they work, how long they’ve been on the job, how often they’ve been overruled, etc. I’m sure there are lots of USPTO examining attorneys who would be shocked to learn that fact. But my sense is that in the US that kind of openness about government employees, and about low-sensitivity personally identifiable information on individuals who access government functions, is pretty standard. And in most of the rest of the world, it’s just not.

The Japanese researchers whose work I am building on did not include information about the examiners who reviewed applications in their dataset at all–they never even retrieved it during processing. I happen to think that correlating application outcomes by examiner is interesting and potentially useful, and I’m going to modify the code I’ve been given to extract that information from the raw data. But in deference to what I take to be cultural norms regarding the privacy of personally identifiable information–norms that I know many of my compatriots would like to import into the US–I think I will probably anonymize the examiner data before reporting my results.
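A minimal sketch of what such an anonymization step might look like, assuming the examiner names have already been extracted into simple records. The record layout, field names, and key below are hypothetical and are not taken from the JPO data or from the researchers' code. It replaces each name with a keyed hash (HMAC-SHA256), so the same examiner always gets the same label and outcomes can still be grouped by examiner. Strictly speaking this is pseudonymization: anyone holding the key could recompute the mapping, so the key has to stay out of anything that gets published.

```python
import hashlib
import hmac


def pseudonymize(name, secret_key, length=12):
    """Return a stable pseudonym for `name` using a keyed hash (HMAC-SHA256)."""
    # Collapse whitespace and case so trivial spelling variants map together.
    normalized = " ".join(name.split()).casefold()
    digest = hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256)
    return "examiner_" + digest.hexdigest()[:length]


# Hypothetical extracted records; the real field names will differ.
records = [
    {"application_id": "A-001", "examiner": "Examiner One", "registered": True},
    {"application_id": "A-002", "examiner": "Examiner Two", "registered": False},
    {"application_id": "A-003", "examiner": "Examiner One", "registered": True},
]

# Assumption: the key is generated once and stored separately from the data.
key = b"replace-with-a-secret-kept-out-of-the-dataset"

anonymized = [
    {**record, "examiner": pseudonymize(record["examiner"], key)}
    for record in records
]
```

Because the first and third records share an examiner, they end up with the same pseudonym, which is all that a by-examiner comparison of outcomes needs.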

Lost in Translation

Yes, the title of this post is a cliché. It was bound to happen at some point on this trip. But I promise it’s really appropriate to today’s post.

I’m finishing out my first week in Japan, and I have been overwhelmed by the generosity and support of everyone I’ve met. Everyone I’ve interacted with in a professional capacity has underpromised and overdelivered. For example: Continue reading…

Summer in Japan: Early Observations on Data and Culture

I arrived in Tokyo two days ago, and have already begun work at the Institute of Intellectual Property, digging in to the Japan Patent Office’s (JPO’s) trademark registration data. I’ve worked with several countries’ intellectual property data systems by now, and I’m starting to think they may provide a window into the societies that produced them–though I’m still too jet-lagged to thoughtfully analyze the connection. Besides which, any analysis purporting to draw such a connection would inevitably be reductive and probably chauvinistic. So, purely by way of observation:

Continue reading…

Going to Tokyo: I’ve Been Appointed an “Invited Researcher” by Japan’s Institute of Intellectual Property

I’m very excited to announce that the Institute of Intellectual Property in Tokyo has invited me to participate in its Invited Overseas Researcher Program this coming summer. Under an agreement with the Japan Patent Office, each year IIP invites a small number of foreign researchers to come to Tokyo to study Japan’s industrial property system. (Past researchers can be found here.) I’ll be spending several weeks in Tokyo this summer doing empirical research into Japan’s trademark registration system (as a foundation for the kind of work discussed in this post). Many thanks to Kevin Collins (who did this program last year) for flagging this opportunity, and to Barton Beebe, Graeme Dinwoodie, and Jay Kesan (also a previous participant in the IIP program) for their support.

Trademarks and Economic Activity

There’s an increasing amount of empirical data available on trademark registration systems. The USPTO released a comprehensive dataset three years ago, and there are less complete and less user-friendly data sources available from other national and regional offices–though some offices make it a bit tricky to get their data, and others restrict access or charge for their data products. As with most trends in legal scholarship, the empirical turn has come late to the study of trademarks. Part of this is because the scholarly community is small, and not as quantitatively minded as other disciplines. Part of it is because it’s not clear what questions regarding trademarks we might look to empirical evidence to answer. I’ve published a study of the impact of the federal antidilution statute on federal registration (spoiler alert: it adds to the cost of registration but doesn’t seem to affect outcomes), but that’s a pretty narrow issue. What else could we learn from this kind of data?

One possibility is to examine the link between trademarks and economic activity. People who make a living from commerce involving intellectual property like to emphasize how important IP protection is to the economy, though the numbers they throw around are a bit dubious. But if we were serious about it, could we rigorously draw some link between trademarks–which are the most common and ubiquitous form of intellectual property in the economy–and economic performance?

I’ve been thinking about how we might do so, so I brought my modest quantitative analytical skills to bear on the best data currently available: the USPTO’s dataset. I thought I’d just look to see whether there is any relationship between trademark activity (in this case, applications for federal trademark registrations) and economic activity (in this case, real GDP). And it seems that there is one…kind of.

[Figure: TM Apps vs. US GDP (quarterly trademark applications and real GDP)]

The GDP data from the St. Louis Fed is reported quarterly and seasonally adjusted; I compiled the trademark application data on a quarterly basis and calculated a 4-quarter moving average as a seasonal smoothing kludge. We see that trademark application activity is strongly seasonal, and that it tends to roughly track GDP trends–perhaps with a bit of a lag. The lag is interesting if more rigorous analysis bears it out: it seems to suggest that trademarks, rather than driving economic activity, are merely a lagging indicator of that activity.
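The quarterly compilation and 4-quarter moving average described above can be sketched roughly as follows. This is an illustrative reconstruction rather than the code behind the chart: the function names are my own, the numbers are made up, and a trailing (rather than centered) window is an assumption. The last function shows one simple way to probe the apparent lag, by correlating GDP in one quarter with applications some quarters later.

```python
from collections import Counter
from datetime import date


def quarterly_counts(filing_dates):
    """Count applications per (year, quarter), in chronological order."""
    counts = Counter((d.year, (d.month - 1) // 3 + 1) for d in filing_dates)
    return dict(sorted(counts.items()))


def moving_average(values, window=4):
    """Trailing moving average; the first window-1 entries are None."""
    smoothed = []
    for i in range(len(values)):
        if i + 1 < window:
            smoothed.append(None)
        else:
            smoothed.append(sum(values[i + 1 - window : i + 1]) / window)
    return smoothed


def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def lagged_correlation(gdp, apps, lag):
    """Correlate gdp[t] with apps[t + lag]; lag > 0 means applications trail GDP."""
    if lag > 0:
        return pearson(gdp[:-lag], apps[lag:])
    return pearson(gdp, apps)


# Made-up quarterly application counts with a seasonal wobble.
counts = [100, 140, 120, 160, 110, 150, 130, 170]
smoothed = moving_average(counts)
# smoothed == [None, None, None, 130.0, 132.5, 135.0, 137.5, 140.0]
```

Averaging over exactly four quarters means every window contains one of each season, which is why the seasonal swings largely cancel. Comparing `lagged_correlation` across a few values of `lag` is only a first look; serial correlation in both series can easily produce a spurious peak.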

The big exception is the late 1990s to the early 2000s. As Barton Beebe documented in his first look at USPTO data, this spike in trademark activity seems to correspond with the dot-com boom and bust. (Registration rates also dropped during this period–lots of these applications were low-quality or quickly abandoned.) It’s interesting to see that this huge discontinuity in trademark application activity doesn’t correlate with anywhere near as big an impact in the overall economy. We could speculate about why that might be–it probably has something to do with the “gold-rush” scramble to occupy a new, untapped field of commerce, and I suspect it also reflects (poorly) on the value of the early web to the overall economy.

This is an example of the kind of analysis these new data sources might be useful for–and it’s not that tricky to carry out. Building this chart was a couple hours’ work, and I’m no expert. A more rigorous econometric model is beyond my expertise, but I’m sure it could be done (I’m less sure what we could learn from it). What other kinds of questions might we look to trademark data to answer?