CiteSeerx data and metadata are available for others to use. The available data includes CiteSeerx metadata, databases, data sets of PDF files, and the extracted text of PDF files. For more information, please contact us directly.
CiteSeerx is compliant with the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH), a standard proposed by the Open Archives Initiative to facilitate content dissemination. For data not mentioned here, please contact us through feedback.
To browse or download records programmatically from the CiteSeerx OAI collection, please use the harvest URL:
http://citeseerx.ist.psu.edu/oai2
The archive may also be browsed through an OAI Repository Explorer, either by using the CiteSeerx archive identifier or by entering the harvest URL directly.
Here is a list of toolkits that can be used for OAI metadata harvesting.
- OAI-Harvester - Perl
- OAIHarvester2 - Java
- .NET OAI Harvester - .NET (dll)
- UIUC OAI - UIUC OAI Metadata Harvesting Project.
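
For readers who would rather not install a toolkit, a request against the harvest URL can be sketched with the Python standard library alone. The example below is a minimal illustration, not an official client. It assumes that the repository answers the standard OAI-PMH `ListIdentifiers` verb with the `oai_dc` metadata prefix and that the harvest URL is still reachable. The function names (`build_request`, `parse_list_identifiers`, `harvest_identifiers`) are hypothetical and chosen only for this sketch.

```python
from urllib.parse import urlencode
from urllib.request import urlopen
import xml.etree.ElementTree as ET

# Harvest URL quoted in the text above.
BASE_URL = "http://citeseerx.ist.psu.edu/oai2"
# XML namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def build_request(verb, **arguments):
    """Return the GET URL for a single OAI-PMH request."""
    return BASE_URL + "?" + urlencode({"verb": verb, **arguments})


def parse_list_identifiers(xml_text):
    """Return (identifiers, resumption_token) from a ListIdentifiers response.

    The token is None when the response carries no further pages.
    """
    root = ET.fromstring(xml_text)
    identifiers = [
        element.text
        for element in root.findall(".//oai:header/oai:identifier", OAI_NS)
    ]
    token = root.findtext(".//oai:resumptionToken", default=None, namespaces=OAI_NS)
    # An empty <resumptionToken/> marks the last page, so treat it as None.
    return identifiers, token or None


def harvest_identifiers(metadata_prefix="oai_dc", max_pages=1):
    """Fetch up to max_pages pages of identifiers (performs network requests)."""
    url = build_request("ListIdentifiers", metadataPrefix=metadata_prefix)
    found = []
    for _ in range(max_pages):
        with urlopen(url) as response:
            identifiers, token = parse_list_identifiers(response.read())
        found.extend(identifiers)
        if token is None:
            break
        # Follow-up requests carry only the verb and the resumption token.
        url = build_request("ListIdentifiers", resumptionToken=token)
    return found
```

Calling `harvest_identifiers(max_pages=2)` would request at most two pages of record identifiers. The `max_pages` limit is a deliberate choice in this sketch to keep an exploratory run from walking the entire collection; a full harvest would loop until no resumption token is returned.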

