The 19th International Conference on
Document Analysis and
Recognition
September 16-21, 2025 Wuhan, Hubei, China
Accepted Papers
All accepted journal track papers will be presented orally at the conference.
Journal Track
On self-supervision in historical handwritten document segmentation
Josef Baloun; Martin Prantl; Ladislav Lenc; Jiří Martínek; Pavel Král
Character Recognition for Greek Squeezes
Nicholas R. Howe; Feiran Chang; Isabella Falbo; Tajhini Brown; Aaron Hershkowitz
Neurosymbolic Information Extraction from Transactional Documents
Arthur Hemmer; Mickaël Coustaty; Nicola Bartolo; Jean-Marc Ogier
Enhancing Music Score Analysis with Monte Carlo Dropout: A Probabilistic Approach to Staff-Region Detection
Samuel B. Oliva-Bulpitt; Juan P. Martinez-Esteso; Alejandro Galan-Cuenca; Francisco J. Castellanos; Antonio Javier Gallego
SlimDoc: Lightweight Distillation of Document Transformer Models
Marcel Lamott; Muhammad Armaghan Shakir; Adrian Ulges; Yves-Noel Weweler; Faisal Shafait
Tabular Context-aware Optical Character Recognition and Tabular Data Reconstruction for Historical Records
Loitongbam Gyanendro Singh; Stuart E Middleton
The PARES Database: Information Extraction over Historical Parish Records
José Andrés; Casey Wall; Solène Tarride; Mickaël Coustaty; Alejandro H. Toselli; Enrique Vidal
Multi-level Style Control for Chinese Handwriting Generation
Gang Yao; Liangrui Peng; Zhiyu Li; Tianqi Zhao; Kemeng Zhao; Ning Ding; Yao Tao
A Low-Intervention Dual-Loop Iterative Process for Efficient Dataset Expansion and Classification in Palm Leaf Manuscript Analysis
NIMOL THUON; Jun Du; Panhapin Theang; Ratana Thuon
Lightweight Cross-Attention-based HookNet for Historical Handwritten Document Layout Analysis
Fei Wu; Mathias Seuret; Martin Mayr; Florian Kordon; Jochen Zöllner; Sebastian Wind; Andreas Maier; Vincent Christlein
Bertrand Duménieu; Ta-Chien Chan; Hsiung-Ming Liao; Wen-Rong Su
Oral Presentations
HIP: Hierarchical Point Modeling and Pre-training for Visual Information Extraction
Long, Rujiao*; Wang, Pengfei; Yang, Zhibo; Cheng, Wenqing
Dual Downsample Vision Transformer for Handwritten Text Recognition
Tan, Yew Lee*; Chew, Ernest Yu Kai; Kim, Jung-jae; Kong, Adams Wai-Kin
DCC: Plug-and-Play Dynamic Category Compression for Enhanced Handwritten Text Generation
Wang, Yiming; Wei, Hongxi*; Wang, Heng; Sun, Shiwen
Layout-Aware Text Editing for Efficient Transformation of Academic PDFs to Markdown
Duan, Changxu*
SFDLA: Source-Free Document Layout Analysis
Tewes, Sebastian; Chen, Yufan; Moured, Omar; Zhang, Jiaming*; Stiefelhagen, Rainer
ComicsPAP: understanding comic strips by picking the correct panel
Vivoli, Emanuele*; Llabrés, Artemis; Souibgui, Mohamed Ali; Bertini, Marco; Valveny, Ernest; Karatzas, Dimosthenis
SemiTabDETR: End-to-End Semi-Supervised Table Detection with Transformer-based Enhanced Query Approach
Shehzadi, Tahira*; Stricker, Didier; Afzal, Muhammad Zeshan
Multi-Scale Convolution Combined with DTW for Online Signature Verification
Yang, Dengshan; Muhammat, Mahpirat; Xu, Xuebin; Aysa, Alimjan; Ubul, Kurban*
TexTAR: Textual Attribute Recognition in Multi-domain and Multi-lingual Document Images
Kumar, Rohan; Jinka, Jyothi Swaroopa; Sarvadevabhatla, Ravi Kiran*
Segmenting France Across Four Centuries
López Rauhut, Marta*; Zhou, Hongyu; Aubry, Mathieu; Landrieu, Loïc
Expertise Finding: Domain Extraction from Documents using Fuzzy Clustering
Sharma Kafle, Dipendra*; Talhi, Esma; Coustaty, Mickael; Doucet, Antoine
The Return of Structural Handwritten Mathematical Expression Recognition
Seitz, Jakob*; Lengfeld, Tobias; Timofte, Radu
FCD-Net: Frequency and Contrastive Learning-Driven Network for Document Image Shadow Removal
Yang, Fan; Jiang, Nan Feng*; Wu, Yu; Wang, Da Han
T-LLaVA: An Effective Saliency-aware Slicing Strategy for Text Recognition
Wei, Mengze*; Yang, Chun; Liang, Min; Zhou, Fang; Zhu, Xiaobin; Yin, Xu-Cheng
Computer-Aided Multi-Stroke Character Simplification by Stroke Removal
Ishiyama, Ryo*; Matsuo, Shinnosuke; Uchida, Seiichi
A Unified Model for Paragraph and Line-level Handwritten Text Recognition
Chew, Ernest*; Kong, Adams; Lim, Joo Hwee
PACM: Position-Aware Cross-Modality Decoder for Handwritten Mathematical Expression Recognition
Li, Zeng*; Wei, Jin; Shen, Zhijie; Ma, Can; Wu, Yaqiang; Zhou, Yu
Total Disentanglement of Font Images into Style and Character Class Features
Haraguchi, Daichi*; Shimoda, Wataru; Yamaguchi, Kota; Uchida, Seiichi
Template-Guided Cascaded Diffusion for Stylized Handwritten Chinese Text-Line Generation
Wang, Honglie*; Ren, Minsi; Zhang, Yan-Ming; Yin, Fei; Liu, Cheng-Lin
FSTDiff: One-Shot Font Generation via Cross-Font Style Transformation Learning
Li, Shilin; Zhu, Anna*
InfoDesignLM: An LLM for Interactive and Controllable Infographic Designing Through Text
Zhang, Xilin; Wang, Hao; Dai, Jianbiao; Zhu, Pinpin*
Federated Unlearning with Clustered Asynchronous Aggregation and Ensemble Learning for Privacy-Preserving Document Analysis
Ali, Ahmad Sarmad; Moetesum, Momina*; Shafait, Faisal; Ul Hasan, Adnan
HiDReader: Human-Inspired Document Reading Agent via Reinforcement Learning
Wang, Changqing; Wang, Hao*; Zhu, Pinpin; Zhang, Huiran
CM1 - A dataset for evaluating few-shot information extraction with Large Vision Language Models
Wolf, Fabian*; Tueselmann, Oliver; Matei, Arthur; Hennies, Lukas; Rass, Christoph; Fink, Gernot A.
A Novel Multi-Modal Dataset and Method for Handwritten Signature Recognition with Image-Audio Fusion
Li, Qixiang; Ablat, Xirali; Lin, Xiaoya; Muhammat, Mahpirat; Ubul, Kurban*
LDTR: Linear Object Detection Transformer for Accurate Graph Generation by Learning the N-hop Connectivity Information
Duan, Weiwei*; Chiang, Yao-Yi; Knoblock, Craig
IndicDLP: A Foundational Dataset for Multi-Lingual and Multi-Domain Document Layout Parsing
Nath, Oikantik; Kukkala, Sahithi; Khapra, Mitesh; Sarvadevabhatla, Ravi Kiran*
SPS-CG: Shape, Pronunciation, and Semantic Joint Modeling for Chinese Character Generation
Xue, Mobai*; Du, Jun; Hu, Pengfei
MATATA: weakly supervised end-to-end MAthematical Tool-Augmented reasoning for Tabular Applications
Vinayagame, Vishnou*; Senay, Gregory; Martí, Luis
Towards the Influence of Text Quantity on Writer Retrieval
Peer, Marco*; Sablatnig, Robert; Kleber, Florian
UniLayDet: Simple Multi-dataset Document Layout Analysis
Srikumar S, Prasidh*; Mondal, Ajoy; Jawahar, C V
AI-Generated Lecture Slides for Improving Slide Element Detection and Retrieval
Maniyar, Suyash*; Trivedi, Vishvesh; Mondal, Ajoy; Mishra, Anand; Jawahar, C V
Personality Trait Prediction from Twitter Data (X) Based on Image, Text, and Image-Text Features
Biswas, Kunal; Palaiahnakote, Shivakumara*; Pal, Umapada; Lopresti, Daniel; Lu, Tong
A Lightweight Context-Driven Training-Free Network for Scene Text Segmentation and Recognition
Chakraborty, Ritabrata; Palaiahnakote, Shivakumara*; Pal, Umapada; Liu, Cheng-Lin
SANet: Multi-Scale Dynamic Aggregation for Chinese Handwriting Recognition
Wang, Sizhu; Shen, Yingshan*; Yuan, Xiaofeng
From Conversations to Insights: A Multimodal Approach to Discussion Summarization
Singh, Punit*; Kumar, Nishant; Mehta, Hrushik; Saha, Sriparna
Selective Forgetting in Document Images using Enhanced Ensembles
Mashhood, Muhammad; Moetesum, Momina*; Shafait, Faisal; Ul-Hasan, Adnan
LIGHT: Multi-Modal Text Linking on Historical Maps
Lin, Yijun*; Olson, Rhett; Wu, Junhan; Chiang, Yao-Yi; Weinman, Jerod
From Scribbles to Text: A Novel Transformer-based Recognition Model for Child Handwriting
Rangasrinivasan, Sahana*; Suresh M. S., Sumi; Setlur, Srirangaraj; Jayaraman, Bharat; Govindaraju, Venu
SemSyn-LCE: A charge prediction method based on semantic syntactic fusion and legal constituent elements matching
Chen, Wenjun*; Du, Bianxia; Xia, Wenhui; HU, Qiao; Hu, Yupeng
Poster Presentations
Hypergraph-Driven Tabular Data Synthesis with Multi-Objective Optimization
Ouyang, Jin; Jia, Xu; Xiang, Sheng*; Zhang, Ying; Qin, Lu
Multimodal Content Alignment with LLM for Visual Presentation of Papers
Hu, Huiying*; He, Zhicheng; Zhou, Yixiao; Zhang, Tongwei; Lyu, Xiaoqing
Class-Agnostic Region-of-Interest Matching in Document Images
Zhang, Demin; Lyu, Jiahao; Shen, Zhijie; Zhou, Yu*
Text Detection in Industrial Design Drawings via Multi-Dimensional Feature Fusion and Differentiable Binarization
Xie, Mingzhao; Xue, Wandong; Chen, Dongming*; Wang, Dongqi
Few-Shot Segmentation of Historical Maps via Linear Probing of Vision Foundation Models
Sterzinger, Rafael*; Peer, Marco; Sablatnig, Robert
Version 5 of the Kraken ATR Engine for the Humanities
Kiessling, Benjamin*
SSSI: Self-Prompted Segmentation of Scientific Illustrations
Wang, Tuo; Zhou, Yixiao; He, Zhicheng; Zhang, Tongwei; Zhao, Yumeng; Lyu, Xiaoqing*
MCCD: A Multi-Attribute Chinese Calligraphy Character Dataset Annotated with Script Styles, Dynasties, and Calligraphers
Zhao, Yixin*; Zhang, Yuyi; Jin, Lianwen
DocAI-TL: Structured Document Tampering Localization with DocAI Model
Li, Youjie*; Zheng, Shiqiang; Zhang, Guijia; Chen, Qifeng; Chen, Changsheng
DAFSVFND: Dual Attention Fusion Network for Fake News Detection on Short Video Platforms
Li, Ruofan*; Zhang, Wei; Liu, Yong
GAMeta-Prompt: A Generative Adversarial Prompt-Based Meta-Language Reasoning Enhancement Method
Qin, Xinshan*; Hu, Yan; Ma, Conglin; Tian, Maochao
TST: Tree Structured Transformer for Handwritten Mathematical Expression Recognition
XIE, Yejing*; Mouchère, Harold
Few-Shot Document Classification in Real Applications: Boosting Precision with Novelty Detection
Pham, Tri-Cong*; Coustaty, Mickaël; Joseph, Aurélie; Deloin, Gaspar; Poulain d’Andecy, Vincent; Doucet, Antoine
Watch and Act: Multi-orientation Open-set Scene Text Recognition via Dynamic Expert Routing
Liu, Chang*; Barney Smith, Elisa H.
Historic Scripts to Modern Vision: A Novel Dataset and A VLM Framework for Transliteration of Modi Script to Devanagari
Kausadikar, Harshal; Kale, Tanvi; Susladkar, Onkar*; Mittal, Sparsh
Graph Convolutional Teacher-Student Framework for Writer Inspection from Intra-Variable Handwritten Words
Priya, Kumari; Kumar, Suraj; Dey, Aritra; Adak, Chandranath*; Chattopadhyay, Soumi; Chanda, Sukalpa; Marinai, Simone
From Notes to Keys: A VR Learning Environment for Sheet Music Interpretation
Khanna, Sandeep; Saha, Atanu; Ray, Rahul Kumar; CHATTOPADHYAY, CHIRANJOY*; Patibanda, Rakesh
Unchecked and Overlooked: Addressing the Checkbox Blind Spot in Large Language Models with CheckboxQA
Turski, Michał; Chiliński, Mateusz; Borchmann, Łukasz*
Japanese Kuzushiji Font Generation Employing Differentiable Renderer
Yuan, Honghui*; Chen, Junwen; Yanai, Keiji
VMF-Net: Visual-Aware Multi-Representation Fusion Network for Artifact-Free Handwritten Mathematical Expressions Generation
Wang, Yiming; Wei, Hongxi*; Wang, Heng; Sun, Bo
Mask CoMER: Enhancing Handwritten Mathematical Expression Recognition with Masked Language Pretraining and Regularization
Phan, Nam Van Hai; Nguyen, Minh-Khoa; Nguyen, Thanh Trung; Pham, Thanh Trung; Tran, Phuong-Nam; Dang, Duc*
RefChartQA: Grounding Visual Answer on Chart Images through Instruction Tuning
Vogel, Alexander; Moured, Omar; Chen, Yufan; Zhang, Jiaming*; Stiefelhagen, Rainer
Interpretable Writer Recognition via Vectors of Locally Aggregated Characters
Raven, Tim*; Fink, Gernot; Christlein, Vincent
Entity-Dependency Memory-Enhanced Document-Level Relation Extraction
Li, Hu*
Self-HTR: A Novel Self-Supervised Handwritten Text Recognition Framework Using Generative Adversarial Networks
Koopmans, Lisa*; Dhali, Maruf; Schomaker, Lambert
Large Language Models for Online Log Parsing in AIOps
Zhang, Suqiong*; Fan, Dongyi; Liu, Yi; He, Lili; Ding, Zuohua
TABLET: TABLE structure recognition using Encoder-only Transformers
Hou, Qiyu; Wang, Jun*
E-FCOS: Enhanced Historical Text Detection with Fast Fourier Transform Denoising and Adaptive Multi-scale Fusion
Liu, Menghui; Wang, Guanghui*; Yu, Lang; Yang, Yilan; Shen, Lingfeng; Li, Heng
Relaxed syntax modeling in Transformers for future-proof license plate recognition
Meyer, Florent*; Guichard, Laurent; Coquenet, Denis; Gravier, Guillaume; Soullard, Yann; Coüasnon, Bertrand
Adaptive Radical Similarity Learning for Chinese Character Recognition
Han, Zhongyuan; Du, Jun*; Hu, Pengfei; Xue, Mobai
QUEST: Quality-aware Semi-supervised Table Extraction for Business Documents
THOMAS, Eliott*; Coustaty, Mickael; Joseph, Aurelie; Deloin, Gaspar; Carel, Elodie; Poulain d'Andecy, Vincent; Ogier, Jean-Marc
Skeleton-Guided Artistic Text Recognition
Do, Tien; Tran, Thuyen; Le, Khiem; Le, Duy-Dinh; Duc Ngo, Thanh*
DocPINN: A Neural PDE-Based Framework for Document Image Dewarping
Fan, Guangrui*
Towards Cross-modal Retrieval in Chinese Cultural Heritage Documents: Dataset and Solution
Yuan, Junyi; Zhang, Jian; Wu, Fangyu*; Lu, Huanda; Lu, Dongming; Wang, Qiufeng
StrokeNet: Unveiling How to Learn Fine-Grained Interactions in Online Handwritten Stroke Classification
Huang, Yiheng; She, Shuang; Wei, Zewei; Lin, Jianmin*; Yang, Ming; Liu, Wenyin
Multidisciplinary End-to-End Document-level Relation Extraction from Scientific Literature
Delaunay, Julien*; Tran, Hanh Thi Hong; González-Gallardo, Carlos-Emiliano; Bordea, Georgeta; Sidere, Nicolas; Doucet, Antoine; De Viron, Olivier
KD-LSRED: Knowledge Distillation for Lightweight Symbol Recognition in Engineering Diagrams
Ekeke, Ikenna*; Moreno-García, Carlos Francisco; Elyan, Eyad
EviFiVQA: A Benchmark for Evidence-Grounded Multi-Hop Reasoning in Financial VQA
Raja, Sachin*; Mondal, Ajoy; Jawahar, C.V.
LLM-Driven Medical Document Analysis: Enhancing Trustworthy Pathology and Differential Diagnosis
Kang, Lei*; Fu, Xuanshuo; Ramos Terrades, Oriol; Vazquez Corral, Javier; Valveny, Ernest; Karatzas, Dimosthenis
Beyond Memorization: Training-Free Style Mixing for Variability in Handwritten Text Generation Using Writer Embedding Injection in Pretrained Diffusion Models
Gurav, Aniket*; Chanda, Sukalpa; Krishnan, Narayanan
UniOne: A Document Parsing Dataset for Cross Task Association Modeling
Yan, Runqing*; An, Jianye
Radical Sequence Encoding with Fine-Tuned CLIP for Handwritten Chinese Character Recognition
Luo, Qiuming; Zeng, Tao*; Wei, Xuan; Kong, Chang
OracleGCD: Generalized Category Discovery for Oracle Bone Scripts
Wu, Hetao*; Li, Kunchi; Zhang, Xuyao; Wang, Qiufeng; Wang, Dahan
PerturbCTC: Improving Alignment in Scene Text Recognition with Feature Perturbation Based CTC
Li, Zeng*; Wei, Jin; Shen, Zhijie; Wu, Yaqiang; Zeng, Gangyan; Yang, Dongbao; Qiao, Zhi; Zhou, Yu
Inverse Scene Text Removal
Yoshimatsu, Takumi*; Takezaki, Shumpei; Uchida, Seiichi
SelectVision: Adaptive Vision Resolution Selection for Visual Document Understanding
He, Zhongjiang*; Yuan, Ye; Zhao, An; Fang, Han; Sun, Hao; Liang, Kongming; Ma, Zhanyu
DevInSight: Weaving Path Development into Online Signature Verification
Shi, Yilin; Jiang, Lei; Ni, Hao; Jin, Lianwen*
OracleProtoPNet: Oracle Character Recognition with Interpretability
Liu, Xin*; Huang, Wen; Chen, Junhui; Wang, Xingyi; Peng, Jian
BiblioPage: A Dataset for Bibliographic Metadata Extraction
Kohút, Jan*; Dočekal, Martin; Hradiš, Michal
Ar-Q-former: Historical newspaper article separation based on multimodal transformer structure
Sun, Wenjun*; Girdhar, Nancy; Hong Tran, Hanh Thi; González-Gallardo, Carlos-Emiliano; Coustaty, Mickaël; Doucet, Antoine
Practical Fine-Tuning of Autoregressive Models on Limited Handwritten Texts
Kohút, Jan*; Hradiš, Michal
Towards Scene Text Recognition in Rainy Weather Conditions
Jamwal, Anandita*; Koneti, Lalithya; Ravikiran, Manikandan; Singh, Dinesh; Saluja, Rohit
WildKhmerST: A Comprehensive Dataset and Benchmark for Khmer Scene Text Detection and Recognition in the Wild
Nom, Vannkinh*; Keo, Saly; Bakkali, Souhail; Luqman, Muhammad Muzzamil; Coustaty, Mickaël; Rusiñol, Marçal; Ogier, Jean-Marc
ColorGPT: Leveraging Large Language Models for Multimodal Color Recommendation
Xia, Ding*; Inoue, Naoto; Qiu, Qianru; Kikuchi, Kotaro
A learning-based method for automatically determining application sequence of intersecting ink lines using photometric stereo
Liang, Yurong; Smith, Melvyn; Smith, Lyndon; Harwood, William; Clement, Simon; Zhang, Wenhao*
HisDoc DETR: Integrating Semantic Learning and Feature Fusion for Historical Document Layout Analysis
Ding, Kai*; Jian, Sheng; Jin, Lianwen
SigLDiff: A Signature Based Latent Diffusion Model for Forged-free Offline Signature Verification
Zhang, Mingjian; Zheng, Lidong; Chen, Jiaen; Zhang, Yichi; Zheng, Yuchen*
TableCall: Boosting Table Question Answering with Tool-Driven Training-Free LLMs
Xu, Chun-Bo*; Chen, Yi-Ming; Li, Xiao-Hui; Yin, Fei; Liu, Cheng-Lin
SPOCNER - a workflow from OCR to spatial mapping
Koudoro-Parfait, Caroline*; Hernandez, Marceau; Dupont, Yoann; Lejeune, Gaël
MIDV-UP: a dataset of Pakistani and Iranian ID documents
Chernyshova, Yulia*; Ilyukhin, Daniil; Arlazarov, Vladimir V.
TextSAM-LoRA: Efficient Fine-tuning of Segment Anything Model for Text Detection with Low-Rank Adaptation
De la Fuente Torres, Carlos*; Sánchez Hernández, Adrián; Calvo Zaragoza, Jorge
AttentionLeak: What Does Human Attention Reveal About Information Visualisation?
Sönnichsen, Malte*; Elfares, Mayar; Wang, Yao; Küsters, Ralf; Roitberg, Alina; Bulling, Andreas
Where Layout Meets Language: Lightweight Spatial Enhancement to Large Language Models for Document Understanding
Biescas, Nil; Biswas, Sanket*; Llados, Josep; Van Landeghem, Jordy
Table Detection with Active Learning
Gautam, Somraj*; Purohit, Nachiketa; Harit, Gaurav
A Multimodal Evaluation Pipeline for Mathematical Expression Recognition: Comparisons of Datasets, Metrics, and Models
Wieckowiak, François*; Eglin, Véronique; Bonnet, Tony; Bres, Stéphane; Rousseau, Laëtitia
DocForgeNet: Dual Cross-Stream Fusion Network for Robust Forgery Detection in Scanned Documents
Riaz, Nauman*; Ahmed, Sheraz; Agne, Stefan; Dengel, Andreas
Classifying the Unknown: In-Context Learning for Open-Vocabulary Text and Symbol Recognition
Simon, Tom*; Mocaer, William; Tranouez, Pierrick; Chatelain, Clément; Paquet, Thierry
Multi-Task Learning for Hebrew Paleography: Script Classification and Date Estimation
Atamni, Nour*; Madi, Boraq; Vasyutinsky Shapira, Daria; Rabaev, Irina; El-Sana, Jihad; Boardman, Shoshana
HoLoSig: Holistic and Local Representation Learning for Online Signature Verification
Almeida, João Pedro*; Macedo, Lucas; Garcia Freitas, Pedro
ExamCleaner: Examination-Paper Handwritten Text Erasure via Large Receptive Field Context Anchor Attention
Hu, Lei; Liu, Dongwei; Chen, Yujia; Wang, Zhenwei; Li, Yamin*
Optimizing Chart Image Classification: A Study of Data Augmentation and Training Strategies
Knize, Josh*; Davila, Kenny
Attend to what I say: Highlighting relevant content on slides
Mariam K M, Megha*; Jawahar, C. V.
SepFormer: Coarse-to-fine Separator Regression Network with Transformer for Table Structure Recognition
Nguyen, Quan; Xuan Phong, Pham; Tuan Anh, Tran*
SFRD: Handwritten Mathematical Expressions Generation by Spatial-Aware Feature Refinement Diffusion
Wang, Yiming; Wei, Hongxi*; Wang, Heng; Sun, Shiwen
Fusing Text Semantics for Text Image Inpainting
Han, Conghao; Duan, Mengyang; Zhu, Anna*
DP-DocLDM: Differentially Private Document Image Generation using Latent Diffusion Models
Saifullah, Saifullah*; Agne, Stefan; Dengel, Andreas; Ahmed, Sheraz
Evaluating Compliance with Visualization Guidelines in Diagrams for Scientific Publications Using Large Vision Language Models
Rückert, Johannes*; Bloch, Louise; Friedrich, Christoph M.
A Unified Framework for Knowledge-Intensive Numerical Reasoning over Financial Document
Yin, Long*; Yin, Kai; Zhao, Hui
DocAnnot - Accelerating the Creation of Key Information Extraction Datasets with GenAI-Powered Auto-Annotation
Nareddy, Siddartha Reddy*; P M, Harikrishnan; S, Goutham Vignesh; V, Varun; Vaddina, Vishal
Att-BiGRU-MulCNN: A New Approach for Intent classification in Apple Pest and Disease
Lvwen, Huang; Yuanshuang, Miao; Yong, Liu
A New Fourier-Attention Guided Approach for Domain-Agnostic Text Localization
Halder, Arnab; Palaiahnakote, Shivakumara*; Pal, Umapada; Blumenstein, Michael; Lu, Yue
KIEval: Evaluation Metric for Document Key Information Extraction
Khang, Minsoo*; Jung, Sangchul; Park, Sungrae; Hong, Teakgyu
Adapting Vision-Language Models for Hindi OCR
Bhattacharyya, Shaon; Ghosh, Souvik; Deb, Prantik; Mondal, Ajoy*; Jawahar, C V
Automated Recognition and Scoring of Handwritten Short Answer: Insights from Japanese Elementary and Junior High Schools
Nguyen, Hung Tuan*; Truong, Thanh-Nghia; Ly, Nam Tuan; Nakagawa, Masaki; Horie, Toshihiko
Challenges in Revealing Readable Text from Fragments Hidden in Book Bindings: A Case Study from the Herlufsholm Collection
Pourahmadi, Baharan*; Andersen, Morten Sielnik; Holck, Jakob Povl; Jensen, Mogens Kragsig; Frandsen, Mads Toudal; Eriksen, René Lynge
Improving OCR using internal document redundancy
Belzarena, Diego; Mowlavi, Seginus*; Mariño, Camilo; Artola, Aitor; Gardella, Marina; Ramírez, Ignacio; Tadros, Antoine; He, Roy; Bottaioli, Natalia; Rajaei, Boshra; Randall, Gregory; Morel, Jean-Michel
Scene Script Identification using Dense Hierarchical Semantic Fusion
Yang, Yaowei; Sun, Zhonghua; Tuerxun, Kaisaier; Ubul, Kurban*
A Handwritten Text Recognition Dataset for Ajami Manuscripts in Fulfulde and Hausa
Yousuf, Oreen*; Aminu, Abdulmalik; Muhammad, Musa; Usman, Bashir; Kurfi, Mustapha; Nivre, Joakim; Megyesi, Beáta; Høgel, Christian
VLMAWR: A Method for Manchu Archives Word Recognition Based on Vision-Language Model
Zhou, Yu; Jin, Zhengxu; He, Jianjun; Wu, Baochun; Cui, Xinshu; ZHENG, RUIRUI*
A Unified Attention-Based Model for Segmenting Compound Words in Sanskrit
Ali, Irfan*; Lo Presti, Liliana; Spano’, Igor; La Cascia, Marco
MultKMAWE: A Multi Key-points based Word Extraction Algorithm for Manchu Archives
He, Jianjun; Ma, Xiaoxuan; Tian, Tian; Zhou, Yu; Liu, Wenpeng; ZHENG, RUIRUI*
Towards Understanding the Logical Layout of Scene Text in Signboard Images
Tran, Giang*; Tran Nhu Cam, Nguyen; Tran Doan, Thuyen; Duc Ngo, Thanh
Verification of Dynamic Holographic Behavior in Identity Documents
Pouliquen, Glen*; Chazalon, Joseph; Chiron, Guillaume; Géraud, Thierry; Awal, Ahmad Montaser
Doc2GraphFormer: Exploring the Synergy of Graph Representations and Transformer Attention for Structured Document Understanding
Mazumder, Souparni; Biswas, Sanket*; Pal, Aniket; Das, Alloy; Pal, Umapada; Llados, Josep
Position-aware Stamp-like Adversarial Attack for Document Classification
Dong, Qi*; Kang, Lei; Pintor, Maura; Karatzas, Dimosthenis
Evaluating Handwritten Text Recognition in Medieval Notarial Manuscripts: a New Dataset and Comprehensive Analysis
Coll Ardanuy, Mariona*; Berganzo-Besga, Iban; Sarobe, Ramon; Cuadrada, Coral
Rx-PAD: Recognition and eXtraction - a dataset for Prescription Analysis and clinical Data structuring
Pattin-Cottet, Jonathan*; Eglin, Véronique; Aussem, Alexandre
KuiSCIMA v2.0: Improved Baselines, Calibration, and Cross-Notation Generalization for Historical Chinese Music Notations in Jiang Kui’s Baishidaoren Gequ
Repolusk, Tristan*; Veas, Eduardo
Uncertainty-Aware Complex Scientific Table Data Extraction
Ajayi, Kehinde*; He, Yi; Wu, Jian
CHSAM: Efficient Scene Text Segmentation via SAM with Convolutional Adapters and Hierarchical Decoding
Zhang, JingYao; Zhang, Heng*; Yin, Fei
HiLEx: Image-based Hierarchical Layout Extraction from Question Papers
Aich, Utathya; Chakraborty, Shinjini; Sadhukhan, Dipan; Ghosh, Swarnendu; Saha, Tulika*
Optimizing Thai-English Spoken Question Answering Interaction for Open Environments with Limited Resources
Singkul, Sattaya*; Petchsod, Atthakorn; Sunantasaengtong, Panya; Sakdejayont, Theerat; Chalothorn, Tawunrat