Commit
1 parent 57b70d0, commit 82412a3
Showing 22 changed files with 3,337 additions and 6 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,48 @@ | ||
# Academic Project Page Template | ||
This is an academic paper project page template. | ||
|
||
|
||
Example project pages built using this template are: | ||
- https://vision.huji.ac.il/spectral_detuning/ | ||
- https://vision.huji.ac.il/podd/ | ||
- https://dreamix-video-editing.github.io | ||
- https://vision.huji.ac.il/conffusion/ | ||
- https://vision.huji.ac.il/3d_ads/ | ||
- https://vision.huji.ac.il/ssrl_ad/ | ||
- https://vision.huji.ac.il/deepsim/ | ||
|
||
|
||
|
||
## Start using the template | ||
To start using the template click on `Use this Template`. | ||
|
||
The template uses HTML to control the content and CSS to control the style. | ||
To edit the website's contents, edit the `index.html` file. It contains different HTML "building blocks"; use whichever ones you need and comment out the rest. | ||
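
As a minimal sketch of commenting out an unused building block: the two sections below are hypothetical, with placeholder text and a placeholder video path, and are not copied from the template's `index.html`.

```html
<!-- Kept: this block is rendered on the page -->
<section class="section">
  <h2 class="title">Abstract</h2>
  <p>Your abstract text.</p>
</section>

<!-- Disabled: everything up to the closing marker is ignored by the browser
<section class="section">
  <video controls>
    <source src="static/videos/your_video.mp4" type="video/mp4">
  </video>
</section>
-->
```

HTML comments do not nest, so if a block you disable already contains its own comment, the outer comment ends at the first closing marker. Remove or move inner comments before wrapping a block this way.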
|
||
**IMPORTANT!** Make sure to replace the `favicon.ico` under `static/images/` with one of your own; otherwise, your favicon is going to be a DreamBooth image of me. | ||
|
||
## Components | ||
- Teaser video | ||
- Images Carousel | ||
- YouTube embedding | ||
- Video Carousel | ||
- PDF Poster | ||
- BibTeX citation | ||
|
||
## Tips: | ||
- The `index.html` file contains comments telling you what to replace; follow them. | ||
- The `meta` tags in the `index.html` file provide metadata about your paper | ||
(e.g., helping search engines index the website and showing a preview image when the website is shared). | ||
- A resolution of around 1920-2048 pixels is usually enough for images and videos; there is rarely a need for higher resolutions, which take longer to load. | ||
- Compress all the images and videos you use so that the website loads quickly (and is thus indexed better by search engines). For images, you can use [TinyPNG](https://tinypng.com); for videos, you need to find the right tradeoff between size and quality. | ||
- For large video files (larger than 10MB), it is better to host the video on YouTube, as serving it from the website can be slow. | ||
- Using a tracker can help you analyze traffic and see where users come from. [statcounter](https://statcounter.com) is a free, easy-to-use tracker that takes under 5 minutes to set up. | ||
- This project page can also be made into a GitHub Pages website. | ||
- Replace the favicon with one of your choosing (the default one is of the Hebrew University). | ||
- Suggestions, improvements, and comments are welcome; simply open an issue or contact me. You can find my contact information at [https://pages.cs.huji.ac.il/eliahu-horwitz/](https://pages.cs.huji.ac.il/eliahu-horwitz/) | ||
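
For the tip about hosting large videos on YouTube, an embed could look like the sketch below. `VIDEO_ID` and the `title` text are placeholders, and the 640x360 size is only an assumption chosen to match a 16:9 video; the wrapper classes are Bulma helpers already used elsewhere in `index.html`.

```html
<!-- Sketch: embed a YouTube-hosted video instead of serving a large file.
     VIDEO_ID is a placeholder; replace it with the ID from your video's URL. -->
<div class="container is-max-desktop has-text-centered">
  <iframe
    width="640"
    height="360"
    src="https://www.youtube.com/embed/VIDEO_ID"
    title="Project video"
    style="border: 0"
    allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
    allowfullscreen>
  </iframe>
</div>
```

With this approach the page itself only loads the lightweight player frame, and the video data is streamed from YouTube when the visitor presses play.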
|
||
## Acknowledgments | ||
Parts of this project page were adapted from the [Nerfies](https://nerfies.github.io/) page. | ||
|
||
## Website License | ||
<a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-sa/4.0/88x31.png" /></a><br />This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">Creative Commons Attribution-ShareAlike 4.0 International License</a>. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,274 @@ | ||
<!DOCTYPE html> | ||
<html> | ||
<head> | ||
<meta charset="utf-8"> | ||
<!-- Meta tags for social media banners; these should be filled in appropriately, as they are your "business card" --> | ||
<!-- Replace the content tag with appropriate information --> | ||
<meta name="description" content="DESCRIPTION META TAG"> | ||
<meta property="og:title" content="SOCIAL MEDIA TITLE TAG"/> | ||
<meta property="og:description" content="SOCIAL MEDIA DESCRIPTION TAG"/> | ||
<meta property="og:url" content="URL OF THE WEBSITE"/> | ||
<!-- Path to banner image, should be in the path listed below. Optimal dimensions are 1200X630--> | ||
<meta property="og:image" content="static/images/your_banner_image.png" /> | ||
<meta property="og:image:width" content="1200"/> | ||
<meta property="og:image:height" content="630"/> | ||
|
||
|
||
<meta name="twitter:title" content="TWITTER BANNER TITLE META TAG"> | ||
<meta name="twitter:description" content="TWITTER BANNER DESCRIPTION META TAG"> | ||
<!-- Path to banner image, should be in the path listed below. Optimal dimensions are 1200X600--> | ||
<meta name="twitter:image" content="static/images/your_twitter_banner_image.png"> | ||
<meta name="twitter:card" content="summary_large_image"> | ||
<!-- Keywords for your paper to be indexed by--> | ||
<meta name="keywords" content="diffusion-models, remote sensing, satellite-imagery"> | ||
<meta name="viewport" content="width=device-width, initial-scale=1"> | ||
|
||
|
||
<title>PSM</title> | ||
<link rel="icon" type="image/x-icon" href="static/images/favicon.ico"> | ||
<link href="https://fonts.googleapis.com/css?family=Google+Sans|Noto+Sans|Castoro" | ||
rel="stylesheet"> | ||
|
||
<link rel="stylesheet" href="static/css/bulma.min.css"> | ||
<link rel="stylesheet" href="static/css/bulma-carousel.min.css"> | ||
<link rel="stylesheet" href="static/css/bulma-slider.min.css"> | ||
<link rel="stylesheet" href="static/css/fontawesome.all.min.css"> | ||
<link rel="stylesheet" | ||
href="https://cdn.jsdelivr.net/gh/jpswalsh/academicons@1/css/academicons.min.css"> | ||
<link rel="stylesheet" href="static/css/index.css"> | ||
|
||
<script src="https://ajax.googleapis.com/ajax/libs/jquery/3.5.1/jquery.min.js"></script> | ||
<script src="https://documentcloud.adobe.com/view-sdk/main.js"></script> | ||
<script defer src="static/js/fontawesome.all.min.js"></script> | ||
<script src="static/js/bulma-carousel.min.js"></script> | ||
<script src="static/js/bulma-slider.min.js"></script> | ||
<script src="static/js/index.js"></script> | ||
<link rel="stylesheet" | ||
href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/10.0.3/styles/default.min.css"> | ||
<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/10.0.3/highlight.min.js"></script> | ||
<script>hljs.initHighlightingOnLoad();</script> | ||
<script | ||
type="module" | ||
src="https://gradio.s3-us-west-2.amazonaws.com/4.25.0/gradio.js" | ||
></script> | ||
</head> | ||
<body> | ||
|
||
|
||
<section class="hero"> | ||
<div class="hero-body"> | ||
<div class="container is-max-desktop"> | ||
<div class="columns is-centered"> | ||
<div class="column has-text-centered"> | ||
<h1 class="title is-1 publication-title">PSM: Learning Probabilistic Embeddings for Multi-scale Zero-Shot Soundscape Mapping</h1> | ||
<div class="is-size-5 publication-authors"> | ||
<!-- Paper authors --> | ||
<span class="author-block"> | ||
<a href="https://subash-khanal.github.io/" target="_blank">Subash Khanal</a>,</span> | ||
<span class="author-block"> | ||
<a href="https://ericx003.github.io" target="_blank">Eric Xing</a>,</span> | ||
<span class="author-block"> | ||
<a href="https://vishu26.github.io/" target="_blank">Srikumar Sastry</a>, | ||
</span> | ||
<span class="author-block"> | ||
<a href="https://sites.wustl.edu/aayush/" target="_blank">Aayush Dhakal</a>, | ||
</span> | ||
<span class="author-block"> | ||
<a href="https://steven-xiong.github.io/" target="_blank">Zhexiao Xiong</a>, | ||
</span> | ||
<span class="author-block"> | ||
<a href="https://adealgis.wixsite.com/adeel-ahmad-geog/" target="_blank">Adeel Ahmad</a>, | ||
</span> | ||
<span class="author-block"> | ||
<a href="https://jacobsn.github.io/" target="_blank">Nathan Jacobs</a> | ||
</span> | ||
</div> | ||
|
||
<div class="is-size-5 publication-authors"> | ||
<span class="author-block">Washington University<br>ACM Multimedia, 2024</span> | ||
</div> | ||
|
||
<div class="column has-text-centered"> | ||
<div class="publication-links"> | ||
<!-- Arxiv PDF link --> | ||
<span class="link-block"> | ||
<a href="https://arxiv.org/pdf/2408.07050" target="_blank" | ||
class="external-link button is-normal is-rounded is-dark"> | ||
<span class="icon"> | ||
<i class="fas fa-file-pdf"></i> | ||
</span> | ||
<span>Paper</span> | ||
</a> | ||
</span> | ||
|
||
<span class="link-block"> | ||
<a href="https://drive.google.com/drive/folders/1NJyba2hoQen_lDCgm9S4MymrsIZDQmgS" target="_blank" class="external-link button is-normal is-rounded is-dark"> | ||
<span>Models</span> | ||
</a> | ||
</span> | ||
|
||
<!-- Github link --> | ||
<span class="link-block"> | ||
<a href="https://github.com/mvrl/PSM" target="_blank" | ||
class="external-link button is-normal is-rounded is-dark"> | ||
<span class="icon"> | ||
<i class="fab fa-github"></i> | ||
</span> | ||
<span>GitHub</span> | ||
</a> | ||
</span> | ||
|
||
<!-- ArXiv abstract Link --> | ||
<span class="link-block"> | ||
<a href="https://arxiv.org/abs/2408.07050" target="_blank" | ||
class="external-link button is-normal is-rounded is-dark"> | ||
<span class="icon"> | ||
<i class="ai ai-arxiv"></i> | ||
</span> | ||
<span>arXiv</span> | ||
</a> | ||
</span> | ||
</div> | ||
</div> | ||
</div> | ||
</div> | ||
</div> | ||
</div> | ||
</section> | ||
|
||
|
||
|
||
<!-- Paper abstract --> | ||
<section class="section hero is-light"> | ||
<div class="container is-max-desktop"> | ||
<div class="columns is-centered has-text-centered"> | ||
<div class="column is-four-fifths"> | ||
<h2 class="title is-3">Abstract</h2> | ||
<div class="content has-text-justified"> | ||
<p> | ||
A soundscape is defined by the acoustic environment a person perceives at a location. In this work, we propose a framework for mapping soundscapes across the Earth. Since soundscapes involve sound distributions that span varying spatial scales, we represent locations with multi-scale satellite imagery and learn a joint representation among this imagery, audio, and text. To capture the inherent uncertainty in the soundscape of a location, we design the representation space to be probabilistic. We also fuse ubiquitous metadata (including geolocation, time, and data source) to enable learning of spatially and temporally dynamic representations of soundscapes. We demonstrate the utility of our framework by creating large-scale soundscape maps integrating both audio and text with temporal control. To facilitate future research on this task, we also introduce a large-scale dataset, GeoSound, containing over 300k geotagged audio samples paired with both low- and high-resolution satellite imagery. We demonstrate that our method outperforms the existing state-of-the-art on both GeoSound and the existing SoundingEarth dataset. | ||
</p> | ||
</div> | ||
</div> | ||
</div> | ||
</div> | ||
</section> | ||
<!-- End paper abstract --> | ||
|
||
<section class="hero is-small"> | ||
<div class="hero-body"> | ||
<div class="container"> | ||
<h2 class="title is-3">🎯 Method</h2> | ||
<div class="columns is-centered has-text-centered"> | ||
<div class="column is-four-fifths"> | ||
|
||
<!-- First Image --> | ||
<div class="publication-image" id="image-container-1"> | ||
<img src="static/images/method.png" alt="method"> | ||
<h2 class="subtitle has-text-centered"> | ||
Our proposed framework, Probabilistic Soundscape Mapping (PSM), combines image, audio, and text encoders to learn a probabilistic joint representation space. Metadata, including geolocation (l), month (m), hour (h), audio-source (a), and caption-source (t), is encoded separately and fused with image embeddings using a transformer-based metadata fusion module. For each encoder, 𝜇 and 𝜎 heads yield probabilistic embeddings, which are used to compute probabilistic contrastive loss. | ||
</h2> | ||
</div> | ||
</div> | ||
</div> | ||
</div> | ||
</div> | ||
</section> | ||
|
||
<!-- Satellite Image to Sound Retrieval Section --> | ||
<section class="section"> | ||
<div class="container is-max-desktop has-text-centered"> | ||
<h2 class="title">Satellite Image to Sound Retrieval Examples</h2> | ||
<video width="640" height="360" controls> | ||
<source src="static/videos/PSM_demo_compressed.mp4" type="video/mp4"> | ||
</video> | ||
</div> | ||
</section> | ||
<!-- End Satellite Image to Sound Retrieval Section --> | ||
|
||
|
||
<!-- Soundscape Maps Section --> | ||
<section class="section"> | ||
<div class="container is-max-desktop"> | ||
<h2 class="title">Soundscape Maps</h2> | ||
<div class="columns is-multiline"> | ||
|
||
<!-- Figure 1 --> | ||
<div class="column is-one-quarter has-text-centered"> | ||
<img src="static/images/figure1.png" alt="Figure 1 Description"> | ||
<h3 class="subtitle">Caption for Figure 1</h3> | ||
</div> | ||
|
||
<!-- Figure 2 --> | ||
<div class="column is-one-quarter has-text-centered"> | ||
<img src="static/images/figure2.png" alt="Figure 2 Description"> | ||
<h3 class="subtitle">Caption for Figure 2</h3> | ||
</div> | ||
|
||
<!-- Figure 3 --> | ||
<div class="column is-one-quarter has-text-centered"> | ||
<img src="static/images/figure3.png" alt="Figure 3 Description"> | ||
<h3 class="subtitle">Caption for Figure 3</h3> | ||
</div> | ||
|
||
<!-- Figure 4 --> | ||
<div class="column is-one-quarter has-text-centered"> | ||
<img src="static/images/figure4.png" alt="Figure 4 Description"> | ||
<h3 class="subtitle">Caption for Figure 4</h3> | ||
</div> | ||
|
||
</div> | ||
</div> | ||
</section> | ||
<!-- End Soundscape Maps Section --> | ||
|
||
|
||
|
||
<!--BibTex citation --> | ||
<section class="section" id="BibTeX"> | ||
<div class="container is-max-desktop content"> | ||
<h2 class="title">BibTeX</h2> | ||
<pre><code>@inproceedings{khanal2024psm, | ||
annotation = {remote_sensing,spotlight}, | ||
author = {Khanal, Subash and Xing, Eric and Sastry, Srikumar and Dhakal, Aayush and Xiong, Zhexiao and Ahmad, Adeel and Jacobs, Nathan}, | ||
thumbnail = {/thumbnails/psm.jpg}, | ||
booktitle = {ACM Multimedia}, | ||
title = {{PSM}: Learning Probabilistic Embeddings for Multi-scale Zero-shot Soundscape Mapping}, | ||
author+an = {7=highlight}, | ||
pdf = {https://arxiv.org/pdf/2408.07050}, | ||
eprint = {2408.07050}, | ||
archiveprefix = {arXiv}, | ||
primaryclass = {cs.CV}, | ||
month = oct, | ||
day = {28}, | ||
year = {2024}}</code></pre> | ||
</div> | ||
</section> | ||
<!--End BibTex citation --> | ||
|
||
|
||
<footer class="footer"> | ||
<div class="container"> | ||
<div class="columns is-centered"> | ||
<div class="column is-8"> | ||
<div class="content"> | ||
|
||
<p> | ||
This page was built using the <a href="https://github.com/eliahuhorwitz/Academic-project-page-template" target="_blank">Academic Project Page Template</a>, which was adapted from the <a href="https://nerfies.github.io" target="_blank">Nerfies</a> project page. | ||
You are free to borrow the source code of this website; we just ask that you link back to this page in the footer. <br> This website is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/" target="_blank">Creative | ||
Commons Attribution-ShareAlike 4.0 International License</a>. | ||
</p> | ||
|
||
</div> | ||
</div> | ||
</div> | ||
</div> | ||
</footer> | ||
|
||
<!-- Statcounter tracking code --> | ||
|
||
<!-- You can add a tracker to track page visits by creating an account at statcounter.com --> | ||
|
||
<!-- End of Statcounter Code --> | ||
|
||
</body> | ||
</html> |