Trending Research
-----------------
[VGGT: Visual Geometry Grounded Transformer](/paper/vggt-visual-geometry-grounded-transformer)
==============================================================================================
[facebookresearch/vggt](https://github.com/facebookresearch/vggt) • 14 Mar 2025
We present VGGT, a feed-forward neural network that directly infers all key 3D attributes of a scene, including camera parameters, point maps, depth maps, and 3D point tracks, from one, a few, or hundreds of its views.
[Depth Estimation](/task/depth-estimation) [Novel View Synthesis](/task/novel-view-synthesis) [**+2**](/paper/vggt-visual-geometry-grounded-transformer#tasks)
2,243 stars • 11.37 stars / hour
[Paper](/paper/vggt-visual-geometry-grounded-transformer)
[Code](/paper/vggt-visual-geometry-grounded-transformer#code)
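The pitch above is essentially an interface claim: a single feed-forward pass over anywhere from one to hundreds of views returns camera parameters, depth maps, point maps, and point tracks together, with no per-scene optimization. As a minimal sketch of that input/output contract only (the module and head names below are hypothetical stand-ins, not the facebookresearch/vggt API):

```python
# Hypothetical sketch of a single-pass, multi-view 3D interface; the real
# VGGT is a large transformer, and none of these names come from its repo.
import torch
import torch.nn as nn

class FeedForward3D(nn.Module):
    """One forward pass over N views -> per-view 3D attributes."""
    def __init__(self, dim: int = 64):
        super().__init__()
        self.backbone = nn.Conv2d(3, dim, kernel_size=3, padding=1)
        self.depth_head = nn.Conv2d(dim, 1, kernel_size=1)   # per-pixel depth
        self.point_head = nn.Conv2d(dim, 3, kernel_size=1)   # per-pixel 3D points
        self.camera_head = nn.Linear(dim, 9)                 # pose + intrinsics

    def forward(self, views: torch.Tensor) -> dict:
        b, n, c, h, w = views.shape                 # n can be 1, a few, or hundreds
        feats = self.backbone(views.flatten(0, 1))  # (b*n, dim, h, w)
        return {
            "depth":  self.depth_head(feats).view(b, n, h, w),
            "points": self.point_head(feats).view(b, n, 3, h, w),
            "camera": self.camera_head(feats.mean(dim=(-2, -1))).view(b, n, 9),
        }

preds = FeedForward3D()(torch.randn(1, 4, 3, 32, 32))  # one scene, four views
print({k: tuple(v.shape) for k, v in preds.items()})
```

The point of the sketch is the contract: every 3D attribute comes out of the same pass, for a variable number of input views.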
[Neural Fields with Thermal Activations for Arbitrary-Scale Super-Resolution](/paper/neural-fields-with-thermal-activations-for)
================================================================================================================================
[prs-eth/thera](https://github.com/prs-eth/thera) • 29 Nov 2023
We present a novel way to design neural fields such that points can be queried with an adaptive Gaussian PSF, so as to guarantee correct anti-aliasing at any desired output resolution.
[Image Super-Resolution](/task/image-super-resolution)
525 stars • 2.50 stars / hour
[Paper](/paper/neural-fields-with-thermal-activations-for)
[Code](/paper/neural-fields-with-thermal-activations-for#code)
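The mechanism named in that sentence is the interesting part: the field is read through a Gaussian PSF whose width tracks the output pixel footprint, rather than being point-sampled. In the paper this filtering follows analytically from the thermal activations; the toy sketch below only approximates a PSF-filtered read by Monte Carlo sampling over a made-up field, so treat it as the idea, not the method:

```python
# Toy illustration of PSF-filtered field queries (not the prs-eth/thera code):
# averaging samples from N(xy, sigma^2 I) approximates a Gaussian-blurred read.
import torch

def field(xy: torch.Tensor) -> torch.Tensor:
    """A toy continuous 'image' with high-frequency content, queried at (..., 2)."""
    return torch.sin(40 * xy[..., 0]) * torch.cos(40 * xy[..., 1])

def query_psf(xy: torch.Tensor, sigma: float, n_samples: int = 256) -> torch.Tensor:
    """Monte Carlo estimate of the field convolved with a Gaussian PSF at xy."""
    offsets = sigma * torch.randn(n_samples, *xy.shape)
    return field(xy + offsets).mean(dim=0)

coords = torch.rand(8, 2)                 # query points in [0, 1]^2
coarse = query_psf(coords, sigma=0.05)    # wide PSF for a low-res output grid
fine = query_psf(coords, sigma=0.005)     # narrow PSF for a high-res output grid
print(coarse.abs().mean().item(), fine.abs().mean().item())
```

Widening sigma suppresses frequencies the coarse grid cannot represent, which is the anti-aliasing behavior the abstract refers to; the high-frequency toy field comes through at small sigma and is damped at large sigma.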
[TxAgent: An AI Agent for Therapeutic Reasoning Across a Universe of Tools](/paper/txagent-an-ai-agent-for-therapeutic-reasoning)
=================================================================================================================================
[mims-harvard/TxAgent](https://github.com/mims-harvard/TxAgent) • 14 Mar 2025
TxAgent selects tools based on task objectives and executes structured function calls to solve therapeutic tasks that require clinical reasoning and cross-source validation, as sketched below.
[AI Agent](/task/ai-agent) [Decision Making](/task/decision-making)
269 stars • 2.49 stars / hour
[Paper](/paper/txagent-an-ai-agent-for-therapeutic-reasoning)
[Code](/paper/txagent-an-ai-agent-for-therapeutic-reasoning#code)
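Two mechanisms in that sentence lend themselves to a sketch: selecting a tool against a task objective, and executing a structured function call. The registry, routing rule, and tool names below are invented for illustration and are not the mims-harvard/TxAgent API:

```python
# Invented mini tool registry and router; illustrative only, not TxAgent.
from typing import Any, Callable

TOOLS: dict[str, Callable[..., str]] = {
    "drug_interactions": lambda drug: f"interaction records for {drug}: ...",
    "dosage_lookup": lambda drug, age: f"dosing guidance for {drug}, age {age}: ...",
}

def select_tool(objective: str) -> str:
    """Toy router; the real agent lets a model choose tools from its toolbox."""
    return "dosage_lookup" if "dose" in objective else "drug_interactions"

def execute_call(call: dict[str, Any]) -> str:
    """Run a structured call of the form {'tool': name, 'args': {...}}."""
    return TOOLS[call["tool"]](**call["args"])

objective = "recommend a dose of metformin for a 70-year-old patient"
call = {"tool": select_tool(objective), "args": {"drug": "metformin", "age": 70}}
print(execute_call(call))
```

The structured-call dict is the key design choice: because arguments are explicit and named, each call can be validated and cross-checked against other sources before its result enters the reasoning chain.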
[ReasonGraph: Visualisation of Reasoning Paths](/paper/reasongraph-visualisation-of-reasoning-paths)
====================================================================================================
[ZongqianLi/ReasonGraph](https://github.com/ZongqianLi/ReasonGraph) • 6 Mar 2025
The reasoning processes of Large Language Models (LLMs) are challenging to analyze due to their complexity and the lack of organized visualization tools.
344 stars • 2.14 stars / hour
[Paper](/paper/reasongraph-visualisation-of-reasoning-paths)
[Code](/paper/reasongraph-visualisation-of-reasoning-paths#code)
[Reinforcement Learning Outperforms Supervised Fine-Tuning: A Case Study on Audio Question Answering](/paper/reinforcement-learning-outperforms-supervised)
===========================================================================================================================================================
[xiaomi-research/r1-aqa](https://github.com/xiaomi-research/r1-aqa) • 14 Mar 2025
Recently, reinforcement learning (RL) has been shown to greatly enhance the reasoning capabilities of large language models (LLMs), and RL-based approaches have been progressively applied to visual multimodal tasks.
[Audio Question Answering](/task/audio-question-answering) [Question Answering](/task/question-answering) [**+1**](/paper/reinforcement-learning-outperforms-supervised#tasks)
168 stars • 1.81 stars / hour
[Paper](/paper/reinforcement-learning-outperforms-supervised)
[Code](/paper/reinforcement-learning-outperforms-supervised#code)
[Step-Video-TI2V Technical Report: A State-of-the-Art Text-Driven Image-to-Video Generation Model](/paper/step-video-ti2v-technical-report-a-state-of)
======================================================================================================================================================
[stepfun-ai/step-video-ti2v](https://github.com/stepfun-ai/step-video-ti2v) • 14 Mar 2025
We present Step-Video-TI2V, a state-of-the-art text-driven image-to-video generation model with 30B parameters, capable of generating videos up to 102 frames based on both text and image inputs.
[Image to Video Generation](/task/image-to-video)
86 stars • 1.62 stars / hour
[Paper](/paper/step-video-ti2v-technical-report-a-state-of)
[Code](/paper/step-video-ti2v-technical-report-a-state-of#code)
[KBLaM: Knowledge Base augmented Language Model](/paper/kblam-knowledge-base-augmented-language-model)
======================================================================================================
[microsoft/KBLaM](https://github.com/microsoft/KBLaM) • 14 Oct 2024
In this paper, we propose Knowledge Base augmented Language Model (KBLaM), a new method for augmenting Large Language Models (LLMs) with external knowledge.
[8k](/task/8k) [In-Context Learning](/task/in-context-learning) [**+6**](/paper/kblam-knowledge-base-augmented-language-model#tasks)
175 stars • 1.61 stars / hour
[Paper](/paper/kblam-knowledge-base-augmented-language-model)
[Code](/paper/kblam-knowledge-base-augmented-language-model#code)
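The snippet only says the method augments an LLM with external knowledge; at a high level, the mechanism is to encode each knowledge-base entry as a continuous key/value vector pair that attention can read alongside the normal context. The encoder and dimensions below are toy stand-ins, not the microsoft/KBLaM implementation:

```python
# Toy sketch: knowledge-base triples become key/value vectors that a query
# token attends over as extra memory. Not the microsoft/KBLaM code.
import torch
import torch.nn.functional as F

dim = 16
triples = [
    ("aspirin", "treats", "headache"),
    ("metformin", "treats", "type 2 diabetes"),
]

def encode(triple: tuple[str, str, str]) -> tuple[torch.Tensor, torch.Tensor]:
    """Stand-in encoder: a deterministic random key/value pair per triple."""
    g = torch.Generator().manual_seed(abs(hash(triple)) % (2**31))
    return torch.randn(dim, generator=g), torch.randn(dim, generator=g)

kb_keys, kb_vals = (torch.stack(t) for t in zip(*(encode(t) for t in triples)))

query = torch.randn(1, dim)                      # one query token
scores = query @ kb_keys.T / dim**0.5            # (1, num_triples)
readout = F.softmax(scores, dim=-1) @ kb_vals    # (1, dim) knowledge read
print(F.softmax(scores, dim=-1), readout.shape)
```

In the paper's framing, keeping the knowledge base in key/value vectors rather than in the prompt is what allows entries to be updated without retraining the model.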
[Data Formulator 2: Iterative Creation of Data Visualizations, with AI Transforming Data Along the Way](/paper/data-formulator-2-iteratively-creating-rich)
===========================================================================================================================================================
[microsoft/data-formulator](https://github.com/microsoft/data-formulator) • 28 Aug 2024
Data analysts often need to iterate between data transformations and chart designs to create rich visualizations for exploratory data analysis.
[Code Generation](/task/code-generation) [Navigate](/task/navigate)
9,873 stars • 1.53 stars / hour
[Paper](/paper/data-formulator-2-iteratively-creating-rich)
[Code](/paper/data-formulator-2-iteratively-creating-rich#code)
[Spark-TTS: An Efficient LLM-Based Text-to-Speech Model with Single-Stream Decoupled Speech Tokens](/paper/2503-01710)
======================================================================================================================
[sparkaudio/spark-tts](https://github.com/sparkaudio/spark-tts) • 3 Mar 2025
Recent advancements in large language models (LLMs) have driven significant progress in zero-shot text-to-speech (TTS) synthesis.
[Attribute](/task/attribute) [Text to Speech](/task/text-to-speech) [**+1**](/paper/2503-01710#tasks)
5,259 stars • 1.53 stars / hour
[Paper](/paper/2503-01710)
[Code](/paper/2503-01710#code)
[LHM: Large Animatable Human Reconstruction Model from a Single Image in Seconds](/paper/lhm-large-animatable-human-reconstruction)
===================================================================================================================================
[aigc3d/LHM](https://github.com/aigc3d/LHM) • 13 Mar 2025
Animatable 3D human reconstruction from a single image is a challenging problem due to the ambiguity in decoupling geometry, appearance, and deformation.
[3D Human Reconstruction](/task/3d-human-reconstruction)
314 stars • 1.39 stars / hour
[Paper](/paper/lhm-large-animatable-human-reconstruction)
[Code](/paper/lhm-large-animatable-human-reconstruction#code)