Documents

Self-Supervised Learning: Redefining the Future of Data-Efficient AI

Andrei McCall

June 20, 2025

Self-Supervised Learning (SSL) has rapidly emerged as a transformative approach in the landscape of artificial intelligence, particularly in the context of data efficiency. Unlike traditional supervised learning paradigms, which rely heavily on vast amounts of labeled data, SSL enables models to learn valuable representations from raw, unlabeled data through pretext tasks. This paper explores the theoretical underpinnings, architectures, and real-world applications of SSL, highlighting its potential to revolutionize domains such as computer vision, natural language processing, and reinforcement learning. We analyze recent advances, including contrastive and generative methods, and examine how SSL integrates with state-of-the-art models like BERT, BYOL, and wav2vec 2.0. Additionally, we discuss key challenges and future research directions, including SSL's role in low-resource environments and its intersection with causal inference and federated learning. Through comprehensive analysis and critical synthesis, we argue that SSL is not just a method but a paradigm shift towards more scalable, efficient, and generalized AI systems.

AI for Scientific Discovery: Automating Hypothesis Generation

Andrei McCall

and 1 more

June 20, 2025

The accelerating pace of scientific advancement has underscored the need for innovative tools to assist researchers in navigating the ever-growing volumes of data and literature. Artificial Intelligence (AI) has emerged as a transformative force in this domain, particularly in automating the process of hypothesis generation, a core component of scientific discovery. This paper explores the current capabilities and methodologies of AI systems designed to formulate plausible, testable scientific hypotheses by synthesizing patterns, identifying anomalies, and integrating multi-domain knowledge. Emphasis is placed on machine learning algorithms, knowledge graphs, and natural language processing tools that mimic cognitive reasoning. Through case studies in biomedical and material science fields, we demonstrate the potential of these systems to accelerate discovery and enhance research efficiency. The study concludes by addressing the epistemological and ethical implications of delegating aspects of scientific reasoning to machines and offers a roadmap for future advancements in hybrid human-AI discovery systems.

A rare case of very late implantable cardioverter-defibrillator lead perforation

Orlaith Casey

and 3 more

June 16, 2025

Implantable cardioverter-defibrillator (ICD) implantation is a common medical procedure that carries many serious risks, including lead perforation. The incidence of ICD lead perforation has previously been estimated at 0.6-5.2% [(1)](#ref-0001). Risk factors for lead perforation include older age, female sex, worse New York Heart Association (NYHA) heart failure class, and nonischaemic cardiomyopathy [(2)](#ref-0002). Lead perforation can present acutely with symptoms related to pericardial effusion or in a delayed fashion, in which case it is commonly asymptomatic [(3)](#ref-0003). Our case describes a patient presenting with acute pericardial-type symptoms many months after ICD implantation and highlights the importance of recognising lead perforation symptoms at any time following device implantation.

Integrative Bioinformatics Analysis Identifies Peripheral Blood Hub Genes and Establi...

Yang Liu

and 6 more

June 16, 2025

Background: Long COVID (LC) is a multisystem condition with symptoms persisting ≥3 months after SARS-CoV-2 infection. Objective biomarkers are lacking, and its immune and molecular mechanisms remain unclear. Methods: We integrated five GEO peripheral blood transcriptome datasets (n = 1,717) and applied ComBat batch correction. Differentially expressed genes (DEGs; |log2FC| > 1, FDR < 0.05) were identified using limma. GO and KEGG enrichment analyses were performed with clusterProfiler. Hub genes were defined via STRING and Cytoscape algorithms. miRNA–mRNA interactions were predicted by miRDB, TargetScan, and miRTarBase. A logistic regression model based on five core genes (CXCR2, CXCR1, JUN, CXCL8, SELPLG) was evaluated by ROC curves, calibration, and decision analysis. Immune cell infiltration was estimated with CIBERSORT, and Reactome GSEA was conducted using fgsea. Results: We identified 289 DEGs enriched in neutrophil degranulation, immune response, and coagulation pathways. Seven hub genes emerged: TLR8, FCGR2A, CXCR2, CXCR1, JUN, CXCL8, and SELPLG. The five-gene model achieved AUC > 0.92 across all cohorts. LC samples showed increased M1 macrophages, neutrophils, and activated dendritic cells, with decreased Tregs, CD8+ T cells, and M2 macrophages. GSEA confirmed dysregulation of innate immunity and coagulation. Conclusion: This large-scale integrative analysis reveals immune and coagulation disturbances in LC, identifies key diagnostic genes and miRNA networks, and establishes a robust five-gene model for LC detection.
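
As a rough illustration of the DEG criteria stated in the Methods above (|log2FC| > 1, FDR < 0.05), the thresholding step might be sketched in Python as follows. This is not the authors' limma pipeline: the `GeneResult` record, the `select_degs` helper, and the placeholder gene names and values are all hypothetical and serve only to show how the two cut-offs combine.

```python
from typing import List, NamedTuple


class GeneResult(NamedTuple):
    """One row of a differential-expression table (hypothetical layout)."""
    gene: str
    log2_fc: float  # log2 fold change, LC vs. control
    fdr: float      # multiple-testing-adjusted p-value


def select_degs(results: List[GeneResult],
                min_abs_log2_fc: float = 1.0,
                max_fdr: float = 0.05) -> List[GeneResult]:
    """Keep genes that pass both cut-offs (strict inequalities, as in the abstract)."""
    return [r for r in results
            if abs(r.log2_fc) > min_abs_log2_fc and r.fdr < max_fdr]


# Placeholder values for illustration only; they are not taken from the study.
example_results = [
    GeneResult("GENE_A", 1.8, 0.001),   # passes both cut-offs
    GeneResult("GENE_B", 0.4, 0.0001),  # significant, but effect too small
    GeneResult("GENE_C", 2.5, 0.20),    # large effect, but not significant
    GeneResult("GENE_D", -1.3, 0.01),   # down-regulated, passes both
    GeneResult("GENE_E", 1.0, 0.01),    # exactly at the threshold, excluded
]

print([r.gene for r in select_degs(example_results)])  # prints ['GENE_A', 'GENE_D']
```

Using the absolute value of the fold change keeps both up- and down-regulated genes, which is why `GENE_D` is retained.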

Ensuring Ethical Health Worker Recruitment by Private Agencies: Comparative Insights...

Mukul Bakhshi

and 6 more

June 16, 2025

Private recruitment agencies are increasingly central to the international migration of health workers, especially nurses, amid a global health workforce shortage. This paper examines how private recruitment practices align with the World Health Organization’s Global Code of Practice on the International Recruitment of Health Personnel (WHO Code) through comparative case studies from the United States, United Kingdom, and Germany, three major destination countries for migrant health professionals. It explores each country’s healthcare system, regulatory landscape, and mechanisms for ethical recruitment, with a special focus on voluntary certification schemes and national compliance programs. The paper also includes insights from the Philippines, a major source country, on its efforts to safeguard migrating health workers. Despite variations in governance, a common theme across these nations is the difficulty in monitoring and enforcing ethical standards, especially when participation in regulatory frameworks is voluntary. The U.S. faces challenges due to minimal federal oversight and legally permissible, yet ethically questionable, practices. The UK and Germany, by contrast, have incentivized adherence through funding mechanisms and legal frameworks. All three countries struggle to track the impact of active recruitment from WHO-designated vulnerable nations, raising concerns about the sustainability of source country health systems. The paper concludes that ethical recruitment is essential for maintaining both workforce integrity in destination countries and healthcare capacity in source countries. Strengthened multilateral oversight, transparency, and stakeholder cooperation are vital to advancing the WHO Code and protecting migrant health workers globally.

Impact of Sodium-Glucose Cotransporter 2 Inhibitor on Recurrence After Catheter Ablat...

Yang Xu

and 14 more

June 16, 2025

Aims: The impact of sodium-glucose cotransporter-2 inhibitors (SGLT2i) on atrial fibrillation (AF) recurrence after catheter ablation is still inconclusive. In addition, their efficacy on AF recurrence stratified by metabolic syndrome (MetS) status remains unknown. Methods: Patients with AF undergoing initial catheter ablation between January 2017 and December 2023 from the China-AF Registry were included. Patients were 1:1 propensity score-matched by SGLT2i use at discharge and stratified by baseline MetS status. The main outcome was AF recurrence after a 3-month blanking period. Results: After propensity score matching, 573 patients in the SGLT2i group and 573 in the non-SGLT2i group were included in the study. During a follow-up of 20.5 ± 13.7 months, AF recurrence occurred in 100 patients (17.5%) in the SGLT2i group and 168 patients (29.3%) in the non-SGLT2i group. SGLT2i was associated with lower AF recurrence (17.5% vs. 29.3%; HR 0.59, 95% CI 0.46–0.75, P<0.001), with consistent benefits in MetS (HR 0.61, 95% CI 0.39–0.75, P=0.03) and non-MetS subgroups (HR 0.58, 95% CI 0.43–0.78, P<0.001, P interaction=0.841). The effect of SGLT2i on AF recurrence also remained consistent across the body mass index (BMI) spectrum (P interaction=0.740). Conclusion: SGLT2i was associated with a lower risk of AF recurrence after catheter ablation independently of MetS status or BMI spectrum.

Research on the New MultiModuel-Seq2Seq Model: Empowering River Water Level Predictio...

Luo Youxi

and 4 more

June 16, 2025

River water level prediction plays a crucial role in preventing flood disasters, safeguarding lives and property, and supporting water resource management and ecological environment protection. This study focuses on innovatively designing a multi-module collaborative water level prediction model to enhance the accuracy and adaptability of river water level prediction. Based on the LSTM-Seq2Seq model, this paper attempts to optimize and improve the model by introducing certain mechanisms and modules, ultimately obtaining the MultiMod-Seq2Seq model. Firstly, the attention mechanism is introduced to enhance the efficiency of extracting key features. Then, the Auto-Correlation module is integrated to analyze the self-correlation characteristics of the sequence. Subsequently, the ATFNet module is utilized to integrate time and frequency domain feature information and fully explore the characteristics of water level data at different frequencies. The Decomposition module is also employed to decompose the data into periodic and trend components, aiming to enhance the model’s adaptability and prediction accuracy in response to complex water level changes. To systematically evaluate the model’s performance, this study selects three datasets with diverse features for comparative experiments, all in the form of time series, including strongly periodic, non-periodic, and mixed datasets. The results show that the model performs well in strongly periodic datasets. In the experiment on the river flow of a certain basin in the Yangtze River using GRDC global runoff data, the MAE and RMSE are reduced by 33.9% and 21.8% respectively compared to the LSTM-Seq2Seq model, and the PeriodScore is increased by 30.0%. The model also performs well in mixed datasets, with better performance than traditional and emerging models. The MAE and RMSE are reduced by 21.6% and 21.1% respectively compared to the LSTM-Seq2Seq model, and the PeriodScore is increased by 30.9%.
In the ablation experiment, the water level data of a certain node in the Yangtze River Basin’s hydrological monitoring station is used, with missing values filled by mean imputation. This helps to explore the contribution of each module to the performance improvement of the LSTM-Seq2Seq model.

Urban Emancipations in Practice: Rethinking Urban Agency and Resistance from Below

Bochra hadj kilani

June 20, 2025

This review examines the collective volume Pratiques d'émancipation urbaine, edited by Claire Carriou, Pauline Guinard, and Martin Olivera, as a significant contribution to contemporary urban theory. Challenging dominant narratives in the fields of urban governance and critical urbanism, the authors explore situated and often invisible forms of emancipation unfolding across diverse socio-spatial contexts. The volume undertakes a comprehensive analysis of empirical case studies distributed across Europe and the Global South. Its objective is to capture the manner in which marginalised actors enact autonomy, resist normative spatial orders, and reshape urban citizenship from the ground up. This review underscores the significance of the work's conceptual originality, interdisciplinary scope, and methodological rigour. Furthermore, it engages critically with the theoretical tensions underpinning the study and their implications for future urban research.

A convex variational formulation for a non-convex model through the Galerkin function...

Fabio Botelho

June 20, 2025

This short communication develops a convex variational formulation for an originally nonconvex primal one through the Galerkin functional. The results are based on standard tools of calculus of variations and optimization theory in Banach spaces.

The burden of chronic obstructive pulmonary disease and its attributable risk factors...

Leul Mekonnen Nigatu

and 7 more

June 20, 2025

List of Tables

Table 1: Description of health states

Table 2: Age standardized rate and percentage change of all-age COPD prevalence and COPD attributable YLDs in 2021 by Sub-Saharan Africa country

Table 3: COPD attributable deaths and YLLs in 2021 and percentage change from 1990 to 2021 across Sub-Saharan Africa country

Evidence for the plant apparency and Janzen Connell hypotheses in a subtropical fores...

Gang Zhou

and 5 more

September 03, 2025

We prove results on unique continuation at the boundary for the solutions of real analytic elliptic partial differential equations of the form \begin{equation} \sum_{i,j=1}^{n}a_{ij}(x)\frac{\partial^{2}u}{\partial x_{i}\partial x_{j}}+\sum_{k=1}^{n}b_{k}(x)\frac{\partial u}{\partial x_{k}}+c(x)u=0 \nonumber \end{equation} This work is motivated by, and generalizes, the main results of \cite{berhanu2021boundary}, \cite{berhanu2021local}, X. Huang et al. in \cite{huang1993unique}, \cite{huang1995hopf}, and M. S. Baouendi and L. P. Rothschild in \cite{baouendi1993local}. Key words: Elliptic partial differential equation; Hopf Lemma; Unique continuation principle; Real analytic hypersurface; Real analytic functions

Moho Beneath the Makran Zone, Iran from Satellite Gravity based on a Hybrid Metaheuri...

Sanam Hosseinzadeh

and 3 more

June 27, 2025

The Makran Subduction Zone (MSZ) has sparse seismic activity, limited accessibility, and insufficient seismic coverage, making traditional seismic methods less effective for estimating Moho depth. As a result, alternative approaches like gravity inversion are essential for Moho modeling in this region. This study utilizes gravity anomalies to model the Moho depth, employing a hybrid inversion method that combines Differential Evolution (DE) and Particle Swarm Optimization (PSO). This approach is suitable for solving high-dimensional and complex gravity inverse problems, ensuring both accuracy and computational efficiency. In the forward modeling, Bouguer gravity anomalies are attributed to an anomalous mass with a constant density between a flat reference Moho and the undulating Moho surface. Rectangular prisms in a Cartesian framework are employed to compute the gravitational field of this mass. Validation on synthetic data shows that the method effectively recovers smoothly changing Moho topographies. Application of the algorithm to the Makran region, using satellite-derived Bouguer anomalies, reveals that Moho depths range from 19-30 km beneath the Oman Sea, 30-40 km in the Makran fore-arc, and 40-45 km over the Taftan-Bazman volcanic arc.

A Heuristic Inspired by Causal Graphs with Dynamic Trace (GCTD) for Autonomous Cleani...

Andrea Signorini

June 20, 2025

Discrete recursive map neuron model

Denis I Bolshakov

and 3 more

June 20, 2025

The study of spiking neural networks is a promising direction of modern interdisciplinary research. Spiking networks have demonstrated high efficacy in processing and classification tasks on various datasets (pictures, acoustic signals, biological signals) and in robotics (navigation, movement control, interaction with media, etc.). One of the main problems of this type of neural network is its high computational and implementation cost. This problem arises from the computational complexity of the many nonlinearities in neuronal models and nonlinear synaptic functions. Novel low-computational-cost models of spiking neurons and synapses could be a possible solution to this problem. Here, we propose a new discrete recursive neuron model. The model demonstrates a rich repertoire of spiking and bursting dynamics and requires relatively small computational resources. Moreover, the proposed model could be implemented by standard discrete logic elements.

Ehrenfest Paradox and the meaning of its non-resolution

Chandru Iyer

June 16, 2025

In the special theory of relativity, proper lengths of rods and rulers remain unaltered. However, non-comoving rods and rulers (appear to) contract along the line of movement. This contraction is asserted to be as real as any conceivable physical measurement that is made by the reference frame with respect to which the rods and rulers are moving. It is well recognised that the contraction is a result of a mismatch in the synchronization of spatially separated clocks. The Ehrenfest paradox highlights the anomalies created by (apparent) length contraction that is real for the non-comoving frame yet non-existent for the comoving frame. Ehrenfest, a reputed theoretical physicist of his time, himself did not offer a solution to the paradox, indicating that the paradox is a critique of the special relativity theory. There is no consensus on the resolution of the paradox apart from evasive ones, such as the claim that the clocks on the circumference cannot be synchronised by any acceptable procedure, or the impossibility of maintaining rigidity during the transition. The original paradox proposed by Ehrenfest envisaged contraction of the circumference. The counterview proposed by Einstein, that the rulers on the circumference contracted, leading to a measurement of a larger circumference, only exacerbates the paradox. Thus the paradox remains unresolved, casting doubt on the tenability of the theory of special relativity. The difficulties that preclude the possibility of an acceptable synchronisation in the rotating frame lead to an impossibility of observing any reality, absolute or otherwise. This is because without synchronisation of spatially separated clocks, it is not possible to measure the length of a moving rod. Since there must be a reality, absolute or otherwise, we suggest that there must be a synchronisation that corresponds to reality.

A Comprehensive Machine Learning Paradigm for Space Debris Surveillance: An Integrate...

Nived Nandakumar

and 1 more

June 16, 2025

The growth of space debris in Low Earth Orbit (LEO) puts the current operation of satellites and spacecraft at risk. The Kessler Syndrome, in which debris is projected to grow at an exponential rate due to continuous collisions, has raised awareness and urgency about this subject. Current ground-based and machine learning approaches are subject to atmospheric distortions and are not able to track the smaller, faster pieces of debris capable of greater damage. Our research addresses this issue with three distinct, integrated machine learning models to perform debris detection, trajectory prediction, and collision risk assessment from a space-based perspective. The first model is a Convolutional Neural Network trained on a dataset of over five thousand images to identify and classify debris and defunct satellites at a 98% accuracy rate. The second model utilizes a Physics-Informed Neural Network incorporating a Long Short-Term Memory model to predict the trajectories of space debris with 97% accuracy, taking into account various orbital parameters such as eccentricity and period. The third model, a Random Forest Regressor, evaluates the risk of collisions between debris and current satellites, yielding 98% accuracy. This framework is designed for space-based laser systems that reduce the orbital lifespan of small debris using a technique known as ablation. This novel approach allows a space-based laser to provide precise and adaptive predictions, unlike current ground-based solutions. By integrating machine learning with space engineering, this study addresses a critical global issue, offering a cost- and energy-efficient way to mitigate the growing threat of space debris and ensure the safety of LEO operations.

Coherent Homology and a Framework for Poincaré Duality in Causal Graphs with Complex-...

Andrea Signorini

June 20, 2025

The theory of Causal Graphs with Dynamic Trace (GCTD) provides a rich framework for modeling discrete systems. However, its homological analysis was hindered by a boundary operator ∂_τ that did not satisfy ∂_τ ∘ ∂_τ = 0. This paper resolves this issue by defining the trace in a commutative field (C) and modifying the (co)boundary operators to be weighted by individual vertices. We prove that this construction ensures ∂_Φ ∘ ∂_Φ = 0 and, for the adjoint operator, d_Φ ∘ d_Φ = 0. This establishes a coherent homology and cohomology. Building on this, we outline a complete framework for Poincaré Duality in GCTDs. We specify the necessary conditions of finiteness and orientability, define the fundamental class [G], and introduce the cap product that realizes the duality isomorphism. This work provides a mathematically sound foundation for the advanced topological analysis of GCTDs.

The Mandelbrot Set on Algebraic Varieties

Philipp Harland

June 20, 2025

In this paper, we give an overview of a certain generalization of the Mandelbrot set, specifically one to algebraic varieties defined as the zero sets of polynomials defined w.r.t. parameter spaces.

The Canonical Structure operation In more Detail, Generalizations

Philipp Harland

June 20, 2025

This paper studies some properties and generalizations of the "canonical structure" operator on strings, denoted C, which was originally introduced as part of the study of Papyrus Oxyrhynchus 90, in [Har25].

Detection of Celestial Anomalies Based on Light Curves and Machine Learning

DING YIMING

June 20, 2025

The universe harbors countless undiscovered celestial phenomena, including exoplanets, starspots, eclipsing binaries, and other brightness anomalies. With the increasing availability of high-quality photometric data from missions like TESS (Transiting Exoplanet Survey Satellite), automated methods are needed to detect subtle light variations. This study focuses on the TOI-700 system and applies the KMeans clustering algorithm to identify periodic brightness dips, most notably those caused by exoplanet TOI-700d. Compared with traditional threshold-based methods, KMeans achieves 85% accuracy and reduces processing time to 2.3 seconds per 10,000 data points. By integrating machine learning with astrophysical context, the project provides a reproducible, accessible method for high school-level researchers to contribute to exoplanetary science.

On The Connection Between (k, 3)-Borcherds Surfaces, Overpacked Lattices and Modular...

Philipp Harland

June 20, 2025

Analogously to the connection in number theory, in this paper, we establish and study the connection (or at least the most "direct" form of it) between overpacked lattices and Borcherds surfaces of index (k, 3).

Exploring Seasonal Dynamics of Landsat 8 and 9 Data on Vegetation Indices in the Moun...

Supath Dhital

and 2 more

June 20, 2025

Earth observation technologies have revolutionized our ability to monitor and assess environmental changes, offering unprecedented insights into the dynamics of ecosystems, land use, and climate patterns. As these technologies evolve, newer satellites like Landsat-9 (L9) offer refined capabilities that demand careful assessment of data quality and reliability improvements. With L9's enhanced radiometric resolution over Landsat-8 (L8), evaluating the impact on data accuracy and temporal resolution is crucial for ensuring consistency in long-term Earth observation and environmental monitoring. This study presents a unique way of evaluating the data consistency between two satellites by examining vegetation indices' seasonal performance and consistency from L8 and L9, focusing on NDVI, NDWI, EVI, and SAVI. Different evaluation metrics, as well as performance over different land use categories, are evaluated. This study shows that L8 generally overestimates vegetation index values compared to L9, except for SAVI, where both satellites perform similarly. L9 exhibits greater sensitivity to vegetation and water features, especially during transitional seasons such as spring and fall. The correlation between indices from both satellites remains robust, with R² values consistently exceeding 0.8, peaking at 0.92 in the fall. In the summer, L9's finer temporal resolution better captures rapid phenological changes, leading to increased variability compared to other seasons. Quantitative metrics indicate that L8 overestimates vegetation indices, with higher average RMSE (~0.05), MAE (~0.04), and a percentage bias of 4% compared to L9. This overestimation persists across most land use classes and seasons, with a notable ~15% negative bias in grassland and shrubland during the summer. L9's enhanced capabilities make it a more reliable tool for monitoring vegetation, particularly water features, in rapidly changing environments. 
Future research should explore a range of geographical and climatic contexts and incorporate additional indicators and high-resolution data, such as multi-spectral drone imagery, to strengthen validation efforts.
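
To make the comparison metrics named in the abstract above more concrete, here is a minimal Python sketch of two of the indices (NDVI and SAVI, in their standard textbook forms) together with RMSE, MAE, and percentage bias. It is only an illustration under stated assumptions: the function names, the soil adjustment factor of 0.5, the percentage-bias definition (summed difference relative to the summed reference), and the reflectance values are hypothetical and are not taken from the study.

```python
import math


def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel."""
    return (nir - red) / (nir + red)


def savi(nir, red, soil_factor=0.5):
    """Soil-Adjusted Vegetation Index; 0.5 is a commonly used default factor."""
    return (1 + soil_factor) * (nir - red) / (nir + red + soil_factor)


def rmse(a, b):
    """Root-mean-square difference between two equally long sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def mae(a, b):
    """Mean absolute difference between two equally long sequences."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def percent_bias(a, b):
    """Bias of `a` relative to reference `b` in percent (assumed definition).

    Positive values mean `a` is higher than `b` on average.
    """
    return 100.0 * sum(x - y for x, y in zip(a, b)) / sum(b)


# Hypothetical (NIR, red) surface reflectance pairs for the same three pixels.
l8_pixels = [(0.45, 0.08), (0.50, 0.10), (0.30, 0.12)]
l9_pixels = [(0.44, 0.09), (0.48, 0.10), (0.30, 0.13)]

l8_ndvi = [ndvi(nir, red) for nir, red in l8_pixels]
l9_ndvi = [ndvi(nir, red) for nir, red in l9_pixels]

print("RMSE:", rmse(l8_ndvi, l9_ndvi))
print("MAE:", mae(l8_ndvi, l9_ndvi))
print("Percent bias (L8 vs. L9):", percent_bias(l8_ndvi, l9_ndvi))
```

With these made-up inputs the bias comes out positive, which is the sign convention under which one sensor would be said to overestimate relative to the other.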

AI Rhythmic Equation Framework for Cognitive Memory and Response (Cognition-Based Rhythmic Response and Memory Structure...)

Ken Park

June 20, 2025

This paper presents a set of AI equations designed to represent cognitive rhythm and structured memory responses. Each equation corresponds to a fundamental behavior observed in rhythm-based AI interaction.

Net positive effects of early-life parasitism on animal host fitness

Sarah Knutie

and 3 more

June 20, 2025

Early-life parasitism can negatively impact the growth, development, and survival of animal hosts. Parasitism can also positively influence developmental processes that increase host resistance, fecundity, and survival. Established processes include the development of resistance barriers, immune priming and training, acquired resistance, and behavior. However, few studies have calculated the overall, net effect of early-life parasitism on host fitness, and instead have focused on a particular life stage or just the cause (e.g., immune response) or consequence (e.g., survival). Indeed, several challenges prevent progress, including immune response specificity, dose-dependent immune responses, logistical feasibility, and analyzing imperfect datasets. Going forward, an integrative approach is needed to address the roadblocks that the field is currently facing.