This page lists the research projects we have completed so far. In addition to papers, we share the code and data produced through these projects.
GIS-based Data Augmentation for Deep Learning
One major challenge of using deep learning models is that they often require large amounts of manually labeled training data. We proposed a deep learning approach with GIS-based data augmentation that automatically generates labeled training map images from shapefiles using GIS operations. We use this approach to enrich the metadata of map images by adding the spatial extents and place names extracted from them.
Code and Data: https://github.com/geoai-lab/MapMetadataEnrichment
Historical Maps and Geographic Knowledge Graphs
Historical maps contain rich geographic information about the past of a region. This project proposes a general workflow for one important step of building a geographic knowledge graph (GKG) from historical maps, namely aligning the same geographic entities across different maps. We experiment with different implementation methods and systematically evaluate their performance using two datasets of historical maps.
Code and Data: https://github.com/geoai-lab/MapKnowledgeBase
NeuroTPR: a Neuro-net ToPonym Recognition model
This project develops NeuroTPR, a Neuro-net ToPonym Recognition model designed for extracting locations from social media messages. Our approach extends a general bidirectional long short-term memory (BiLSTM) model with a number of features designed to handle the language irregularity in social media messages. The model was applied to a dataset of tweets from Hurricane Harvey in 2017.
Code and Data: https://github.com/geoai-lab/NeuroTPR
Extensible and Unified Platform for Evaluating Geoparsers
This project develops EUPEG: an Extensible and Unified Platform for Evaluating Geoparsers. EUPEG is an open-source, web-based benchmarking platform that hosts the majority of open corpora, geoparsers, and performance metrics reported in the literature. It enables direct comparison of the hosted geoparsers, and a new geoparser can be connected to EUPEG and compared with the others.
Code and Data: https://github.com/geoai-lab/EUPEG
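To illustrate the kind of performance metric such a comparison relies on, here is a minimal sketch of precision, recall, and F1 computed over recognized toponyms. It assumes exact matching on hypothetical `(start, end, text)` tuples; the function name and the toy annotations are illustrative only and are not taken from EUPEG.

```python
def precision_recall_f1(predicted, gold):
    """Compare predicted toponyms against gold annotations by exact match."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical annotations as (start offset, end offset, surface text).
gold = [(0, 7, "Houston"), (20, 25, "Texas")]
predicted = [(0, 7, "Houston"), (30, 36, "Harvey")]

precision, recall, f1 = precision_recall_f1(predicted, gold)
print(precision, recall, f1)
```

Exact matching is the strictest choice; a benchmark may also use more lenient matching rules, which this sketch does not cover.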
Topic Modeling and Sentiment Analysis
People's perceptions of neighborhoods reveal their satisfaction with their living environments and their perceived quality of life. In this project, we analyze online neighborhood review data to understand these perceptions. Specifically, we perform topic modeling to identify the topics that people discuss about their neighborhoods, and sentiment analysis to understand the emotions they express.
Code and Data: https://github.com/geoai-lab/TopicAndSentimentAnalysis
Harvesting Local Place Names from Housing Posts
Local place names are frequently used by the residents of a geographic region, but they may not be recorded in existing gazetteers. In this work, we propose a computational framework for harvesting local place names from geotagged housing posts. We use posts from locally oriented websites, such as Craigslist, where local place names are often mentioned. The proposed framework consists of two stages: natural language processing (NLP) and multi-scale geospatial clustering.
Code and Data: https://github.com/YingjieHu/LocalPlaceName
Place Names and Their Changes with Geographic Distance
In this project, we conduct an empirical study based on 112,071 points of interest (POIs) in seven US metropolitan areas extracted from an open Yelp dataset. We adopt term frequency and inverse document frequency in geographic contexts to identify local terms used in POI names and to analyze their usage across different POI types. Our results show that local terms are used unevenly across POI types. We also examine how POI name similarity decays as the distance between POIs increases.
Code and Data: https://github.com/YingjieHu/POI_Name
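The idea of applying term frequency and inverse document frequency in a geographic context can be sketched as follows. This is a minimal illustration that assumes each metropolitan area is treated as one "document" made of its POI names; the function name, the weighting formula, and the toy names are assumptions for illustration and may differ from the formulation used in the project.

```python
import math
from collections import Counter


def local_term_scores(names_by_region):
    """Score terms per region with TF-IDF, treating each region as one document."""
    # Term counts within each region (term frequency).
    counts = {
        region: Counter(word for name in names for word in name.lower().split())
        for region, names in names_by_region.items()
    }
    n_regions = len(counts)
    # Number of regions in which each term appears (document frequency).
    doc_freq = Counter(term for c in counts.values() for term in c)

    scores = {}
    for region, c in counts.items():
        total = sum(c.values())
        scores[region] = {
            term: (freq / total) * math.log(n_regions / doc_freq[term])
            for term, freq in c.items()
        }
    return scores


# Hypothetical toy data, not taken from the Yelp dataset.
names_by_region = {
    "Phoenix": ["Desert Pizza", "Cactus Cafe", "Desert Auto Repair"],
    "Pittsburgh": ["Steel City Pizza", "Three Rivers Cafe"],
    "Charlotte": ["Queen City Pizza", "Uptown Cafe"],
}

scores = local_term_scores(names_by_region)
top_terms = sorted(scores["Phoenix"], key=scores["Phoenix"].get, reverse=True)
print(top_terms[:3])
```

Terms that appear in every region, such as "pizza" in the toy data, receive a score of zero, while terms concentrated in one region score highest there.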
Semantic Relatedness between Cities via News Articles
This project develops a topic-modeling-based workflow for extracting semantic relatedness between cities from news articles, which contain a rich amount of information about cities and their relations. The workflow makes use of city co-occurrences in news articles and employs a topic model, Labeled Latent Dirichlet Allocation (LLDA), to analyze the full texts of the articles and extract city relatedness. The extracted semantic relatedness can contribute to applications such as urban planning and geographic information retrieval (GIR).
Code and Data: https://github.com/YingjieHu/CityRelatednessViaNews
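The co-occurrence step of such a workflow can be sketched as below. This is a minimal illustration that assumes cities are detected by naive case-insensitive substring matching against a known list; the function name and the toy articles are hypothetical, and the topic-modeling step is not covered.

```python
from collections import Counter
from itertools import combinations


def city_cooccurrence(articles, cities):
    """Count how many articles mention each unordered pair of cities."""
    pair_counts = Counter()
    for text in articles:
        lowered = text.lower()
        # Naive substring matching; sorting keeps each pair in a stable order.
        mentioned = sorted(c for c in cities if c.lower() in lowered)
        for pair in combinations(mentioned, 2):
            pair_counts[pair] += 1
    return pair_counts


# Hypothetical toy articles for illustration only.
articles = [
    "Flights between Chicago and Denver were delayed.",
    "Chicago and Boston announced a joint transit study.",
    "Denver and Chicago mayors met on Tuesday.",
]
cities = ["Boston", "Chicago", "Denver"]

counts = city_cooccurrence(articles, cities)
print(counts.most_common())
```

In practice, city names would need disambiguation (for example, cities that share a name), which plain substring matching does not handle.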