Pos Paper Citations Year
1 PySpark and RDKit: moving towards big data in cheminformatics 139 2019
2 How to keep text private? A systematic review of deep learning methods for privacy-preserving natural language processing 127 2023
3 Machine learning in continuous casting of steel: A state-of-the-art survey 126 2022
4 External and intrinsic plagiarism detection using vector space models 118 2009
5 Understanding the true effects of the COVID-19 lockdown on air pollution by means of machine learning 116 2021
6 Why do users tag? Detecting users’ motivation for tagging in social tagging systems 96 2010
7 Machine learning in prediction of intrinsic aqueous solubility of drug‐like compounds: Generalization, complexity, or predictive ability? 90 2021
8 Of categorizers and describers: An evaluation of quantitative measures for tagging motivation 88 2010
9 Recent advances of differential privacy in centralized deep learning: A systematic survey 84 2025
10 Understanding why users tag: A survey of tagging motivation literature and results from an empirical study 71 2012
11 Authorship identification of documents with high content similarity 63 2018
12 External and intrinsic plagiarism detection using a cross-lingual retrieval and segmentation system 62 2010
13 Evaluation of folksonomy induction algorithms 55 2012
14 Mesh-free surrogate models for structural mechanic FEM simulation: A comparative study of approaches 55 2021
15 Aspects of broad folksonomies 54 2007
16 Deep learning—a first meta-survey of selected reviews across scientific disciplines, their commonalities, challenges and research impact 53 2021
17 Establishing and evaluating trustworthy AI: overview and research challenges 52 2024
18 Big data as a promoter of industry 4.0: Lessons of the semiconductor industry 49 2017
19 Constructing robust health indicators from complex engineered systems via anticausal learning 49 2022
20 Assessing trustworthy AI: Technical and legal perspectives of fairness in AI 48 2024
21 Teambeam-meta-data extraction from scientific literature 48 2012
22 Unsupervised document structure analysis of digital scientific articles 48 2014
23 Formula rl: Deep reinforcement learning for autonomous racing using telemetry data 47 2021
24 A Literature Survey of Early Time Series Classification and Deep Learning 46 2016
25 Daphne: An open and extensible system infrastructure for integrated data analysis pipelines 45 2022
26 Saga++: A scalable framework for optimizing data cleaning pipelines for machine learning applications 45 2026
27 Should we embed in chemistry? A comparison of unsupervised transfer learning with PCA, UMAP, and VAE on molecular fingerprints 44 2021
28 Polarity classification for target phrases in tweets: a Word2Vec approach 41 2016
29 Theory-inspired machine learning—towards a synergy between knowledge and data 41 2022
30 Ubiquitous access to digital cultural heritage 40 2017
31 Analysis of structural relationships for hierarchical cluster labeling 38 2010
32 Predicting treatment outcomes using explainable machine learning in children with asthma 38 2021
33 Recommending tags for pictures based on text, visual content and user context 38 2008
34 A comparison of two unsupervised table recognition methods from digital scientific articles 37 2014
35 An unsupervised machine learning approach to body text and table of contents extraction from digital scientific articles 35 2013
36 Efficient linear text segmentation based on information retrieval techniques 35 2009
37 Information extraction from German radiological reports for general clinical text and language understanding 35 2023
38 A historical perspective of biomedical explainable AI research 34 2023
39 Detection of Abusive Speech for Mixed Sociolects of Russian and Ukrainian Languages 33 2018
40 Reconsidering read and spontaneous speech: Causal perspectives on the generation of training data for automatic speech recognition 33 2023
41 QZTool—automatically generated origin-destination matrices from cell phone trajectories 32 2016
42 Adversarial inter-group link injection degrades the fairness of graph neural networks 31 2022
43 Feature extraction from analog wafermaps: A comparison of classical image processing and a deep generative model 30 2019
44 Using factual density to measure informativeness of web documents 30 2013
45 Deriving public transportation timetables with large-scale cell phone data 28 2015
46 Identifying referenced text in scientific publications by summarisation and classification techniques 26 2016
47 Map-matching cell phone trajectories of low spatial and temporal accuracy 26 2015
48 Structack: Structure-based adversarial attacks on graph neural networks 25 2021
49 Gaussian process surrogates for modeling uncertainties in a use case of forging superalloys 24 2022
50 SAZED: parameter-free domain-agnostic season length estimation in time series data 24 2019
51 A comparison of layout based bibliographic metadata extraction techniques 23 2012
52 Improving the consistency of the failure mode effect analysis (FMEA) documents in semiconductor manufacturing 23 2022
53 Enhancing OCR in historical documents with complex layouts through machine learning 22 2025
54 Knowledge discovery using the KnowMiner framework 22 2009
55 Extraction of references using layout and formatting information from scientific articles 21 2013
56 Predictive capability of QSAR models based on the CompTox zebrafish embryo assays: An imbalanced classification problem 21 2021
57 Towards a More Fine Grained Analysis of Scientific Authorship: Predicting the Number of Authors Using Stylometric Features 21 2016
58 Vote/veto meta-classifier for authorship identification notebook for PAN at CLEF 2011 20 2011
59 Machine learning techniques for automatically extracting contextual information from scientific publications 19 2015
60 Crowdsourcing fact extraction from scientific literature 18 2013
61 Ensemble machine learning, deep learning, and time series forecasting: improving prediction accuracy for hourly concentrations of ambient air pollutants 18 2024
62 Improving FMEA comprehensibility via common-sense knowledge graph completion techniques 18 2023
63 Towards Authorship Attribution for Bibliometrics using Stylometric Features 18 2015
64 Body mass index, body image dissatisfaction, and eating disorder symptoms in female aquatic sports: Comparison between artistic swimmers and female water polo players 17 2020
65 Extending folksonomies for image tagging 17 2008
66 Improving OCR quality in 19th century historical documents using a combined machine learning based approach 17 2024
67 Know-center at semeval-2019 task 5: multilingual hate speech detection on twitter using cnns 17 2019
68 Exploiting propositions for opinion mining 16 2016
69 Self- and cross-excitation in stack exchange question & answer communities 16 2019
70 A causality-inspired approach for anomaly detection in a water treatment testbed 15 2022
71 A comparison of supervised approaches for process pattern recognition in analog semiconductor wafer test data 15 2018
72 Astro- and geoinformatics–visually guided classification of time series data 15 2020
73 Distributed Web2.0 crawling for ontology evolution 15 2007
74 Lessons learned from the 1st Ariel Machine Learning Challenge: Correcting transiting exoplanet light curves for stellar spots 15 2023
75 Parasitic resistance as a predictor of faulty anodes in electro galvanizing: a comparison of machine learning, physical and hybrid models 15 2020
76 Effective use of BERT in graph embeddings for sparse knowledge graph completion 14 2022
77 GerIE-An Open Information Extraction System for the German Language 14 2018
78 KCDC: Word sense induction by using grammatical dependencies and sentence phrase structure 14 2010
79 Treatment outcome clustering patterns correspond to discrete asthma phenotypes in children 14 2021
80 Reconstructing the logical structure of a scientific publication using machine learning 13 2016
81 Unleashing semantics of research data 13 2012
82 Chatbots assisting German business management applications 12 2019
83 A generative semi-supervised classifier for datasets with unknown classes 11 2020
84 Is enterprise search useful at all? Lessons learned from studying user behavior 11 2014
85 Cluster purging: Efficient outlier detection based on rate-distortion theory 10 2021
86 Detecting outliers in non-iid data: A systematic literature review 10 2023
87 Driver's dashboard–using social media data as additional information for motorway operators 10 2018
88 Exploration of transfer learning techniques for the prediction of PM 10 2025
89 Extending Scientific Literature Search by Including the Author's Writing Style 10 2017
90 Activity archetypes in question-and-answer (q&a) websites—a study of 50 stack exchange instances 9 2019
91 An embedding approach for microblog polarity classification 9 2017
92 citation needed: Filling in Wikipedia's Citation Shaped Holes 9 2014
93 Large language models for fault detection in buildings’ HVAC systems 9 2024
94 Long short-term memory networks for enhancing real-time flood forecasts: a case study for an underperforming hydrologic model 9 2025
95 Recommending scientific literature: Comparing use-cases and algorithms 9 2014
96 Text representation for efficient document annotation 9 2013
97 Vote/Veto Classification, Ensemble Clustering and Sequence Classification for Author Identification 9 2012
98 AI-Based Knowledge Management System for Risk Assessment and Root Cause Analysis in Semiconductor Industry 8 2022
99 Ein industrie 4.0-use case in der motorenproduktion 8 2018
100 Ensemble methods 8 2016
101 Large language models for electronic health record de-identification in english and german 8 2025
102 The interplay between communities and homophily in semi-supervised classification using graph neural networks 8 2021
103 Using the open meta kaggle dataset to evaluate tripartite recommendations in data markets 8 2019
104 Vote/veto classification, ensemble clustering and sequence classification for author identification-Notebook of PAN at CLEF 2012 8 2012
105 Exploring the capabilities of gpt4-vision as ocr engine 7 2024
106 Exploring the influence of tagging motivation on tagging behavior 7 2010
107 Graz University of Technology at CL-SciSumm 2017: Query Generation Strategies 7 2017
108 Interpretability of causal discovery in tracking deterioration in a highly dynamic process 7 2024
109 KnowMiner: Ein service orientiertes knowledge discovery framework 7 2006
110 ManEx: The visual analysis of measurements for the assessment of errors in electrical engines 7 2022
111 Privacy in open search: A review of challenges and solutions 7 2021
112 Solving multi-objective inverse problems of chained manufacturing processes 7 2023
113 Stylometric watermarks for large language models 7 2024
114 Understanding wafer patterns in semiconductor production with variational auto-encoders 7 2018
115 A health factor for process patterns enhancing semiconductor manufacturing by pattern recognition in analog wafermaps 6 2019
116 An Information Retrieval Based Approach for Multilingual Ontology Matching 6 2016
117 A study of scientific writing: Comparing theoretical guidelines with practical implementation 6 2014
118 Enhanced Active Learning of Convolutional Neural Networks: A Case Study for Defect Classification in the Semiconductor Industry 6 2020
119 Know-Center at PAN 2015 author identification 6 2015
120 Markov random fields for pattern extraction in analog wafer test data 6 2017
121 Model selection strategies for author disambiguation 6 2011
122 Source selection of long tail sources for federated search in an uncooperative setting 6 2018
123 Using ontologies for software documentation 6 2009
124 Addressing hallucination in causal q&a: The efficacy of fine-tuning over prompting in llms 5 2025
125 Crosslanguage retrieval based on wikipedia statistics 5 2008
126 Effects of class imbalance countermeasures on interpretability 5 2024
127 Evaluation of pseudo relevance feedback techniques for cross vertical aggregated search 5 2015
128 German encyclopedia alignment based on information retrieval techniques 5 2010
129 Grammar Checker Features for Author Identification and Author Profiling 5 2013
130 Impact of training instance selection on domain-specific entity extraction using BERT 5 2022
131 KnCe2013-CORE: Semantic Text Similarity by use of Knowledge Bases 5 2013
132 Opinion mining with a clause-based approach 5 2017
133 PyChemFlow: an automated pre-processing pipeline in Python for reproducible machine learning on chemical data 5 2023
134 A formally robust time series distance metric 4 2020
135 A semantic federated search engine for domain-specific document retrieval 4 2017
136 Cyber-Physical Systems as Enablers in Manufacturing Communication and Worker Support 4 2019
137 Efficient table annotation for digital articles 4 2015
138 Ensemble watermarks for large language models 4 2025
139 Flexible scheduling for human robot collaboration in intralogistics teams 4 2018
140 KOMPOS: Connecting causal knots in large nonlinear time series with non-parametric regression splines 4 2021
141 On the impact of communities on semi-supervised classification using graph neural networks 4 2020
142 Profiling microblog authors using concreteness and sentiment 4 2016
143 Towards a marketplace for the scientific community: accessing knowledge from the computer science domain 4 2014