Page 1:
Self-Adaptive Paraphrasing and Preference Learning for
Improved Claim Verifiability
Amelie Wührl1,2and Roman Klinger2
1University of Stuttgart, Germany2University of Bamberg, Germany
firstname.lastname@uni-bamberg.de
Abstract
In fact-checking, structure and phrasing of
claims critically influence a model’s ability to
predict verdicts accurately. Social media con-
tent in particular rarely serves as optimal input
for verification systems, which necessitates pre-
processing to extract the claim from noisy con-
text before fact checking. Prior work suggests
extracting a claim representation that humans
find to be checkworthy and verifiable. This
has two limitations: (1) the format may not be
optimal for a fact-checking model, and (2), it
requires annotated data to learn the extraction
task from. We address both issues and pro-
pose a method to extract claims that is not re-
liant on labeled training data. Instead, our self-
adaptive approach only requires a black-box
fact checking model and a generative language
model (LM). Given a tweet, we iteratively op-
timize the LM to generate a claim paraphrase
that increases the performance of a fact check-
ing model. By learning from preference pairs,
we align the LM to the fact checker using di-
rect preference optimization. We show that this
novel setup extracts a claim paraphrase that is
more verifiable than their original social media
formulations, and is on par with competitive
baselines. For refuted claims, our method con-
sistently outperforms all baselines.
1 Introduction
In fact-checking, structure, length and the overall
claim representation impact models’ ability to reli-
ably predict a verdict. Despite increased resources
and modeling efforts dedicated to user-generated
medical content and organically occurring med-
ical claims on social media, a performance gap
remains in fact-checking across different types of
claims (Kim et al., 2021; Wührl and Klinger, 2022).
Possibly, this is because naturally occurring claims
are more complex and longer, and contain multi-
ple, inter-related facts compared to claims in other
fact verification settings (Sarrouti et al., 2021; Zuo
Reference model
Paraphrasing modelFact checking modelParaphrase n
preferred: n-1 rejected: nDPOEvidenceParaphrase ntweet
EvidenceParaphrase n-1Figure 1: Illustration of the self-adaptive optimiza-
tion cycle using direct preference optimization (DPO)
guided by a fact-checking (FC) model.
et al., 2022). Since models do not transfer robustly
to colloquial claims (Kim et al., 2021), we hypothe-
size that this is because such claims are not optimal
input for fact checking models. Consider this tweet
stating: ‘Just saw someone claiming that sipping
on boiled garlic water is the magic cure for COVID-
19. Anyone else heard this one?’. The checkworthy
claim ‘Drinking boiled garlic water cures COVID-
19’ is embedded in context, potentially distracting
the fact checking model and deteriorating its per-
formance. Given that the same model performs
robustly on the concise, extracted version of the
claim (Wührl and Klinger, 2022), we presume that
adapting these properties could enhance the fact-
checking process for colloquial claims. Instead of
modifying the model to accommodate colloquial
input, we therefore propose to refine the input itself
for better alignment with the model.
Claim extraction offers an intuitive solution.
Prior work explores extracting claims from long
documents (Deng et al., 2024) or noisy con-
texts (Sundriyal et al., 2023), for instance to iden-
tify checkworthy claims in discourse or to aligning
the expert terminology of individual claim com-
ponents with the language used in evidence docu-
ments (Wuehrl et al., 2023). Related tasks include
claim detection, which identifies claim documents
or sentences in argument mining (Lippi and Tor-
1arXiv:2412.11653v1 [cs.CL] 16 Dec 2024
Page 2:
roni, 2015; Gangi Reddy et al., 2022; Zaberer et al.,
2023, i.a.), and checkworthiness detection, which
identifies claims that require fact-checking (Hassan
et al., 2017; Wright and Augenstein, 2020; Majer
and Šnajder, 2024, i.a.).
Importantly, prior work extracts a claim repre-
sentation that humans find to be checkworthy and
verifiable. This has two limitations: (1) it requires
annotated data to learn the extraction task from and
(2) even with gold labeled data, the format may not
be optimal for a fact-checking model. Optimizing
the extracted claim for a downstream model, how-
ever, can not be learned end-to-end. Reinforcement
learning (RL) is one way to address such problems
as it allows learning from environment feedback
and a reward objective as opposed to from labeled
data. In conjunction with the increased access to
highly fluent generative language models, RL meth-
ods that align model outputs to preferences, have
grown in popularity. For instance, He et al. (2023)
and Ziegenbein et al. (2024) explore RL to generate
counter misinformation responses and to increase
argument appropriateness. To address the unstable
and computationally expensive nature of RL-based
optimization, Rafailov et al. (2024) recently pro-
pose direct preference optimization (DPO). DPO
is an alignment algorithm to optimize a genera-
tion policy, i.e., a language model. Learning from
preference pairs, a large language model (LLM) is
trained to to assign high probabilities to preferred
prompt completions, while assigning lower proba-
bilities to tokens of rejected completions.
We build on this and propose a self-adaptive,
iterative framework for claim extraction. We gen-
erate claim paraphrases that aligns the verification
input based on a preference signal coming from a
fact checking model. Figure 1 illustrates the DPO-
based alignment cycle. Starting with a colloquial
claim from social media and an off-the-shelf LLM,
we iteratively extract the claim through paraphras-
ing. Each iteration updates the model using direct
preference optimization, leveraging the feedback
from the fact-checking model to steer the LLM gen-
erations towards a claim paraphrase that enhances
verifiability.
We investigate two research questions: [RQ1]
How effective is self-adaptive, DPO-based claim
paraphrasing to enhance verifiability? and [RQ2]
Which claim properties emerge throughout the self-
adaptive paraphrasing process? We show that this
novel setup extracts a claim representation that is
more verifiable than their original social media for-mulations, and is on par with competitive baselines.
For refuted claims, our method consistently out-
performs all baselines. A key finding form our
analysis is that self-adapted claims are very con-
cise compared to their social media variants and
even shorter than human-written claims.
2 Methods
Given a fact checking model trained to predict a
fact checking verdict for claim-evidence pairs, and
a social media post that contains a medical claim,
our goal is to extract a paraphrase of the claim that
constitutes the best, i.e., most checkable, input for
the fact checking model. Figure 1 illustrates the
iterative process we suggest to optimize the input
claim. Given a generative language model, we task
the model to extract the claim from a social me-
dia document using zero-shot prompting. We pass
the extracted claim along with its evidence docu-
ment1to the fact checking model which predicts an
entailment-style fact checking verdict, indicating if
the evidence SUPPORTS orREFUTES the claim or
if their relation is NEUTRAL . For each claim, we
compare the prediction of the fact checking model
with the prediction for the claim-evidence pair from
the previous iteration. This way, we obtain a pref-
erence pair: the claim which was more reliably
checked, is the preferred claim; the other claim is
the rejected one. Using these preference pairs, we
update the language model by fine-tuning it with
the DPO loss. This aims to align the language
model generations to the fact checking model’s ex-
pected input, while constraining the update using a
reference language model to avoid phenomena such
as reward hacking. After fine-tuning, we use the
updated model to generate new claim paraphrases
given the social media posts and continue the pro-
cess for niterations.
Direct preference optimization. Direct prefer-
ence optimization (Rafailov et al., 2024) is an al-
gorithm to optimize a generation policy, i.e., a lan-
guage model, by learning from preference pairs.
Intuitively speaking, given a preferred and a re-
jected completion to a prompt, the LLM is trained
to assign high probabilities to the preferred out-
put, while assigning rejected completions lower
probabilities. In formal terms, given a preference
dataset Dwhich consists of triplets x, yw, yl, where
xis a prompt with a chosen ( yw) and rejected
1As we focus on claim extraction, we presume an oracle
setting and use the gold annotated evidence from a dataset.
2
Page 3:
(yl) completion, we fine-tune a language model
πθwith the loss function LDPO. The optimization
is KL-constrained against a reference model πref
and scaled by the parameter β:
LDPO(πθ;πref) =
−E(x,yw,yl)∼Dh
logσ
βlogπθ(yw|x)
πref(yw|x)
−βlogπθ(yl|x)
πref(yl|x)i
.(1)
Preference pairs. Given two claim paraphrases
and their respective predictions from a fact check-
ing model, we prefer the one with the correct label.
If both predictions match the gold label, we choose
the one with higher label confidence. If neither is
correct but share the same incorrect label, we prefer
the one with lower confidence. If both are incorrect
but differ, we select randomly, unless one predic-
tion is NEUTRAL , in which case it is preferred.
Data. We aim to understand how to optimize claim
extraction from medical social media posts. Thus,
we require social media texts that convey medical
claims which we generate using a large language
model. Given a seed claim csfrom a dataset for
biomedical fact checking D, we generate a tweet-
style paraphrase ctw. To obtain a diverse set of
synthetic tweets, we prompt the model using ran-
domly generated personas. We provide details on
the personas in Appendix A.1. Table 2 shows an
example seed claim, its evidence and synthetic
tweet. We use synthetic tweets for two reasons.
First, existing biomedical fact-checking datasets
lack tweets paired with extracted claims which pre-
vents comparison of DPO-paraphrased claims and
human-written ones. Second, fact checking relies
on claim-specific evidence. Social media data fre-
quently contains multiple claims within one docu-
ment (Wuehrl et al., 2024). This tasks the model
to identify relevant claims before learning to opti-
mally phrase them.
3 Experiments
3.1 Experimental Setting
We run the optimization loop (Fig. 1) as described
in Sec. 2 with the components outlined in Sec. 3.2
for a total of 10 iterations. The fact checking perfor-
mance serves as a proxy to evaluate the paraphrases.
After each DPO update, the model generates new
paraphrases for the test portion of the dataset. We
pass the claim paraphrases together with their evi-dences to the fact checking model and evaluate its
performance using precision, recall and F 1.
We gauge how self-adaptive extraction compares
to alternative setups, namely no extraction, either
leaving the claim embedded in a social media post
or using the seed claim which is isolated by nature,
and zero-shot extraction. Thus, we compare the
fact checking performance for the following claim
inputs: seed claim cs(the upper bound), synthetic
tweet ctw(baseline 1), zero-shot-extracted check-
worthy claim (0-cw) and zero-shot-extracted core
claim (0-ex) (baselines 2 and 3, respectively). Re-
fer to Appendix A.2.3 for details on the baselines.
3.2 Components
Dataset. We use the HealthVer dataset (Sarrouti
et al., 2021) for evidence-based fact-checking of
health-related online claims. The dataset consists
of 14,330 claim-evidence pairs. The claims serve as
our seed claims cs. Using Llama-3-8B-Instruct ,
we generate synthetic tweets ctwthat convey cs
in the style of a social media post. Refer to Ap-
pendix A.1 for prompting details.
Fact checking. We frame fact checking as an en-
tailment or Natural Language Inference (NLI) task.
Each instance is a premise-hypothesis pair. The
claim is the hypothesis, while the evidence is the
premise. The model predicts whether the claim is
ENTAILED orCONTRADICTED by the evidence or
if there is a NEUTRAL relation between the two.
As the fact checking model, we use mDeBERTa , a
RoBERTa-based medium-sized model, trained for
multilingual NLI2. We choose to experiment with
this model for two reasons: (a) it omits the need for
task-specific training data and (b) it is lightweight
and computationally efficient. Appendix A.2.1 out-
lines the implementation details.
Paraphrasing. Our base and reference model for
paraphrasing is Llama-3-8B-Instruct , which we
update in each iteration. The model learns from
preference pairs, i.e., a chosen and a rejected com-
pletion to the following prompt: Your task is
to extract the checkworthy claim from a
piece of text. Here is the text: < ctw>.
We instruct the model to output json, providing
the system prompt: You are a fact checking
assistant. For efficient fine-tuning, we use a
LoRA adapter (Hu et al., 2022) and train for two
epochs using the DPO loss. Appendix A.2.2 pro-
2https://huggingface.co/MoritzLaurer/
mDeBERTa-v3-base-mnli-xnli
3
Page 4:
DPO iteration
csctw0-ex 0-cw 0 1 2 3 4 5 6 7 8 9
.46 .34 .43 .40 .40 .42 .42 .42 .41 .43 .42 .43 .42 .42
Table 1: Fact checking results (weighted F 1) across
claim inputs.
vides the implementation details. We use the train-
dev-test split as provided in the HealthVer dataset.
3.3 Results
Our goal is to understand how effective self-
adaptive, DPO-based claim paraphrasing is in en-
hancing verifiability (RQ1). Table 1 shows the fact
checking results (weighted F 1) across claim inputs.
Across the iterations, the performance increases
slightly (from .40F 1to .43F 1). This indicates that
the self-adapted claim paraphrases present a more
suitable input for the fact checking model as we
keep updating the model. Compared to the zero-
shot baselines, the iterative processes outperforms
0-cw and achieves comparable performance to 0-
ex. The upper bound achieves an F 1-score of .46.
The least suitable input for the fact checker is the
unchanged tweet (.34F 1), indicating that claim ex-
traction is always beneficial.
Figure 2 plots the per class F 1-scores for self-
adaptive paraphrases compared to the strongest
baseline (0-ex) and inputting an unextracted claim,
i.e., the tweet. For NEUTRAL , performance
increases mostly consistently across iterations
(maxF 1: .60, minF 1: .54). and SUPPORTED claims,
performance increases until iteration 2 and fluctu-
ates afterwards (maxF 1: .36, minF 1: .31). The per-
formance for REFUTED claims does not shows any
consistency in the performance, with F 1-scores fluc-
tuating between .27 and .31. Compared to the base-
line, all inputs lead to comparable performances
forNEUTRAL claims. For SUPPORTED claims, the
zero-shot extraction mostly outperforms the self-
adaptive claims. For REFUTED claims, the self-
adaptive claims outperform all baselines.
Table 2 shows examples of the generated para-
phrases. While the initial iterations show substan-
tial changes, paraphrases stagnate after, reflecting
the plateauing fact checking performance.
4 Analysis
To understand which claim properties emerge
throughout the self-adaptive paraphrasing process
and how the claims compare to the seed claims
0123456789
DPO iteration0.20.30.40.50.6F-score
dpo_Neu
dpo_Ref
dpo_Sup
c_tw_Neu
c_tw_Ref
c_tw_Sup
0-ex_Neu
0-ex_Ref
0-ex_SupFigure 2: Per class fact checking performance (F scores)
across varying claim inputs.
(RQ2), we perform two analyses.
As we hypothesize that concise claims are more
robustly verified, we first analyze claim length. On
average, tweets ( ctw) consist of 41 words. For
the first two iterations, claim length decreases dra-
matically, on average, down to 14.9 words. After
that, claim length stagnates, indicating minimal
changes in later paraphrases. Notably, the self-
adapted claims are shorter than the seed claims.
Table 3 shows the results in detail.
Second, we compare similarity between each it-
eration’s paraphrases and seed claims using BLEU,
METEOR, and translation error rate (TER), which
measures the number of edits required (see Table 4).
All metrics show increasing similarity over the first
two iterations, before stagnating for the remain-
ing rounds. However, the absolute scores for all
metrics indicate only modest similarity between
self-adapted and seed claims. This is not neces-
sarily bad, instead, it supports our hypothesis that
claims optimized for a fact checking model may
differ from human-formulated claims.
5 Conclusion
We propose a self-adaptive framework for extract-
ing online biomedical claims. To optimize fact veri-
fication inputs for fact checking, we iteratively fine-
tune a LLM using preference learning. The prefer-
ence signal comes from a fact checking model to
generate a claim paraphrase that is more verifiable
for the fact checker. Our method increases the veri-
fiability of claims compared to their original social
media formulations. However, zero-shot extraction
presents a competitive baseline While zero-shot ex-
traction is a competitive baseline for SUPPORTED
andNEUTRAL claims, our method consistently out-
performs all baselines for REFUTED claims.
4
Page 5:
Limitations
While instantiating the individual components is
limited to one set of models and focused on one
dataset, we choose the starting components in a
way that they are general enough to gauge the capa-
bility of our method. Specifically, we work with a
state-of-the-art large language model and a general-
domain approach to fact checking using the NLI,
instead of using a highly specialized model for
biomedical fact checking. This being said, all com-
ponents may of course be optimized, for example
by adapting the fact checking model, to improve
the overall performance. However, since we are
interested in the effect of varying the input claims,
the overall performance is somewhat negligible. It
is more important to gauge the performance delta
between iterations and inputs.
Synthetic data may not be fully representative
of the diverse nature of online discourse. While
we prompt with different personas to increase va-
riety, we observe in a manual inspection that the
synthetic tweets frequently use similar paraphrases
such as embedding the seed claim in tweets start-
ing with “Just learned that . . . " or posing “Did you
know that. . . " type questions to convey the claim.
Presumably this could be a result of instruction tun-
ing, leading the model to use such rhetoric instead
of spreading unverified claims. In the future, we
have to explore this method for other datasets and
domains to understand its capabilities for highly
diverse checkworthy content.
We constrain the updates to the paraphrasing
model using a reference model, which is common
practice for LLM alignment methods both in re-
inforcement learning for human feedback (RLHF)
and direct preference optimization (DPO). This is
intended to avoid reward hacking and keep out-
puts coherent. However, we hypothesize that this
is one of the reasons the paraphrases stagnate af-
ter the initial iterations. In the future, we aim to
investigate the optimization process without this
constraint as a way to understand which claim prop-
erties the fact checking model may exploit when
unguided. Perhaps this advances our understand-
ing of the weaknesses of the fact checking model,
while also shedding light on which claim elements
remain when removing a readability constraint. On
a similar note, exploring other ways to address the
stagnating paraphrases –while out of scope for our
prove-of-concept study– is crucial. Considering
how sensitive LLMs are with respect to promptvariation, future work has to investigate the effect
of alternative extraction prompts, specifically to
‘encourage’ the model to generate paraphrases that
move away from the original wording in the social
media post. Alternatively, we may adapt sampling
strategies or other generation parameters to allow
for more variation in the output.
Acknowledgements
This work was conducted and funded as part of the
CEAT project (DFG, KL 2869/1-2.).
References
Zhenyun Deng, Michael Schlichtkrull, and Andreas Vla-
chos. 2024. Document-level claim extraction and de-
contextualisation for fact-checking. In Proceedings
of the 62nd Annual Meeting of the Association for
Computational Linguistics (Volume 1: Long Papers) ,
pages 11943–11954, Bangkok, Thailand. Association
for Computational Linguistics.
Revanth Gangi Reddy, Sai Chetan Chinthakindi, Zhen-
hailong Wang, Yi Fung, Kathryn Conger, Ahmed EL-
sayed, Martha Palmer, Preslav Nakov, Eduard Hovy,
Kevin Small, and Heng Ji. 2022. NewsClaims: A
new benchmark for claim detection from news with
attribute knowledge. In Proceedings of the 2022 Con-
ference on Empirical Methods in Natural Language
Processing , pages 6002–6018, Abu Dhabi, United
Arab Emirates. Association for Computational Lin-
guistics.
Naeemul Hassan, Fatma Arslan, Chengkai Li, and Mark
Tremayne. 2017. Toward automated fact-checking:
Detecting check-worthy factual claims by claim-
buster. In Proceedings of the 23rd ACM SIGKDD
International Conference on Knowledge Discovery
and Data Mining , KDD ’17, page 1803–1812, New
York, NY , USA. Association for Computing Machin-
ery.
Bing He, Mustaque Ahamad, and Srijan Kumar.
2023. Reinforcement learning-based counter-
misinformation response generation: A case study
of covid-19 vaccine misinformation. In Proceedings
of the ACM Web Conference 2023 , WWW ’23, page
2698–2709, New York, NY , USA. Association for
Computing Machinery.
Edward J Hu, Yelong Shen, Phillip Wallis, Zeyuan
Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and
Weizhu Chen. 2022. LoRA: Low-rank adaptation of
large language models. In International Conference
on Learning Representations .
Byeongchang Kim, Hyunwoo Kim, Seokhee Hong, and
Gunhee Kim. 2021. How robust are fact checking
systems on colloquial claims? In Proceedings of
the 2021 Conference of the North American Chap-
ter of the Association for Computational Linguistics:
5
Page 6:
Human Language Technologies , pages 1535–1548,
Online. Association for Computational Linguistics.
Marco Lippi and Paolo Torroni. 2015. Context-
independent claim detection for argument mining.
InProceedings of the 24th International Conference
on Artificial Intelligence , IJCAI’15, page 185–191.
AAAI Press.
Laura Majer and Jan Šnajder. 2024. Claim check-
worthiness detection: How well do LLMs grasp an-
notation guidelines? In Proceedings of the Sev-
enth Fact Extraction and VERification Workshop
(FEVER) , pages 245–263, Miami, Florida, USA. As-
sociation for Computational Linguistics.
Rafael Rafailov, Archit Sharma, Eric Mitchell, Stefano
Ermon, Christopher D. Manning, and Chelsea Finn.
2024. Direct preference optimization: your language
model is secretly a reward model. In Proceedings
of the 37th International Conference on Neural In-
formation Processing Systems , NIPS ’23, Red Hook,
NY , USA. Curran Associates Inc.
Mourad Sarrouti, Asma Ben Abacha, Yassine Mrabet,
and Dina Demner-Fushman. 2021. Evidence-based
fact-checking of health-related claims. In Findings
of the Association for Computational Linguistics:
EMNLP 2021 , pages 3499–3512, Punta Cana, Do-
minican Republic. Association for Computational
Linguistics.
Megha Sundriyal, Tanmoy Chakraborty, and Preslav
Nakov. 2023. From chaos to clarity: Claim normal-
ization to empower fact-checking. In Findings of the
Association for Computational Linguistics: EMNLP
2023 , pages 6594–6609, Singapore. Association for
Computational Linguistics.
Dustin Wright and Isabelle Augenstein. 2020. Claim
check-worthiness detection as positive unlabelled
learning. In Findings of the Association for Compu-
tational Linguistics: EMNLP 2020 , pages 476–488,
Online. Association for Computational Linguistics.
Amelie Wuehrl, Lara Grimminger, and Roman Klinger.
2023. An entity-based claim extraction pipeline for
real-world biomedical fact-checking. In Proceedings
of the Sixth Fact Extraction and VERification Work-
shop (FEVER) , pages 29–37, Dubrovnik, Croatia.
Association for Computational Linguistics.
Amelie Wuehrl, Yarik Menchaca Resendiz, Lara Grim-
minger, and Roman Klinger. 2024. What makes
medical claims (un)verifiable? analyzing entity and
relation properties for fact verification. In Proceed-
ings of the 18th Conference of the European Chapter
of the Association for Computational Linguistics (Vol-
ume 1: Long Papers) , pages 2046–2058, St. Julian’s,
Malta. Association for Computational Linguistics.
Amelie Wührl and Roman Klinger. 2022. Entity-based
claim representation improves fact-checking of medi-
cal content in tweets. In Proceedings of the 9th Work-
shop on Argument Mining , pages 187–198, Online
and in Gyeongju, Republic of Korea. International
Conference on Computational Linguistics.Urs Zaberer, Sebastian Pado, and Gabriella Lapesa.
2023. Political claim identification and categoriza-
tion in a multilingual setting: First experiments. In
Proceedings of the 19th Conference on Natural Lan-
guage Processing (KONVENS 2023) , pages 219–228,
Ingolstadt, Germany. Association for Computational
Lingustics.
Timon Ziegenbein, Gabriella Skitalinskaya, Alireza
Bayat Makou, and Henning Wachsmuth. 2024. LLM-
based rewriting of inappropriate argumentation using
reinforcement learning from machine feedback. In
Proceedings of the 62nd Annual Meeting of the As-
sociation for Computational Linguistics (Volume 1:
Long Papers) , pages 4455–4476, Bangkok, Thailand.
Association for Computational Linguistics.
Chaoyuan Zuo, Kritik Mathur, Dhruv Kela, Noushin
Salek Faramarzi, and Ritwik Banerjee. 2022. Be-
yond belief: A cross-genre study on perception and
validation of health information online. International
Journal of Data Science and Analytics , 13(4):299–
314.
A Appendix
A.1 Data
Synthetic tweets. For each seed claim csin the
HealthVer dataset (Sarrouti et al., 2021), we gen-
erate a tweet-style paraphrase ctwthat conveys
the claim. Using LLAMA -3-8B-I NSTRUCT , we
prompt the model as follows: <persona system
prompt> Your task is to write a Twitter
post in which you paraphrase a claim
or statement that I give you. Please
paraphrase the statement so that it reads
like one of your social media posts.
Please format your reply as valid json:
{""post"": ""YOUR REPLY""} Only output
the json. Here is the statement: <cs>
Persona system prompt. To obtain a diverse set
of synthetic tweets, we prompt the model using
randomly generated personas which we include as
a system prompt (see above). The persona sys-
tem prompts are also LLM-generated ( LLAMA -3-
8B-I NSTRUCT ). To this end, we randomly pick
two demographic attributes and a profession to
construct a persona: You are <demographical
attribute 1>. You are <demographical
attribute 2>. You are <profession.> We se-
lect the demographic attributes from the following
list: [a teenager, a young adult, an adult, a senior
citizen, a male social media user, a female social
media user, a non-binary social media user, Ameri-
can, Canadian, British, Indian, Chinese, Brazilian,
Nigerian, Mexican, Japanese, Australian, British,
6
Page 7:
French, German, Italian]. We select the profes-
sion from the following list: [ a retail cashier, a
teacher, a receptionist, a customer service repre-
sentative, a construction worker, a security guard,
a barista, a truck driver, an electrician, a plumber,
a carpenter, a mechanic, a HV AC technician, a
welder, a software engineer, a nurse, an accountant,
a marketing manager, a human resources manager,
a graphic designer, a real estate agent, a pharmacist,
a data scientist, a robotics engineer, a cybersecu-
rity analyst, a marine biologist, a cryptographer, a
neurosurgeon, an ethical hacker, a sommelier, an
artisan cheesemaker, an astronaut, a high school
student, a college student]. Subsequently, we pro-
vide the persona (example: You are a teenager. You
are a non-binary social media user. You are a stu-
dent.) as a system prompt and ‘ask’ the model:
Who are you? Here is an example output: “Hey!
I’m Zephyr, nice to meet you! I’m a non-binary
teenager, which means I don’t identify as strictly
male or female. I’m still figuring out who I am and
what I want to be, but I’m happy to be me, you
know?"
A.2 Experimental Details
A.2.1 Fact checking
We use the transformers library and provide the
model with tokenized premise-hypothesis pairs.
We convert the model output into probabilities for
each class ( ENTAILMENT ,NEUTRAL ,CONTRADIC -
TION ) represented by the logits using Softmax. We
run the experiments on a single Nvidia GeForce
RTX A6000 GPU. Inference for all instances in the
HealthVer data takes approx. 10 minutes.
A.2.2 Model Training with DPO
We use the transformers and the trl library
which implements the DPO loss function
in the DPOTrainer. We train with the fol-
lowing parameters: num_train_epochs=2,
per_device_train_batch_size=12,
per_device_eval_batch_size=4, gra-
dient_accumulation_steps=1, gra-
dient_checkpointing=True, op-
tim="adamw_torch_fused", learning_rate=5e-
5, max_grad_norm=0.3, warmup_ratio=0.1,
lr_scheduler_type="cosine", logging_steps=25,
save_steps=500, save_total_limit=2,
eval_strategy="steps", eval_steps=700, bf16=True,
beta=0.1, loss_type="sigmoid"
To fine-tune the paraphrasing model in each iter-
ation, we use a LoRA adapter which we train usingthe DPO loss. We configure the LoRA adapter
as follows: lora_alpha=128, lora_dropout=0.05,
r=256, bias="none", target_modules="all-linear",
task_type="CAUSAL_LM".
One DPO update (training for 2 epochs) takes
approx. 2 hours and 45 minutes.
A.2.3 Zero-shot Baselines
For the zero-shot extraction baselines, we use two
prompt variants. One of them specifies to extract
thecore claim, whereas the other specifies to ex-
tract the checkworthy claim from the tweet. We
refer to them as 0-ex and 0-cw, respectively.
The 0-ex prompt consists of the system prompt
You are a helpful, highly skilled
assistant. and the task prompt Your task is
to extract the core claim from a piece
of text. Please format your reply as
valid json: {""post"": ""YOUR REPLY""}
Only output the json. Here is the text:
<ctw>.
The 0-cw prompt consists of the system prompt
You are an experienced fact checker. and
the task prompt Your task is to extract the
checkworthy claim from a piece of text.
Please format your reply as valid json:
{""post"": ""YOUR REPLY""} Only output
the json. Here is the text: < ctw>.
A.3 Analysis
Table 2 showcases three seed claims along with
their evidence pieces, synthetic tweets and para-
phrases across the self-adaptive claim optimization
process.
Table 3 shows the average claim lengths (in
words) for the seed claims, synthetic tweets and
paraphrases across the self-adaptive claim opti-
mization process.
Table 4 shows the average BLEU and METEOR
score and average translation error rate (TER) for
the paraphrases we obtain after each iteration of
DPO updates.
7
Page 8:
id ex1 ex2 ex3
cs Drinking boiled garlic water will
cure COVID-19.Social distancing is a voluntary
practice to help stop the spread of
COVID-19there are few novel sars-cov-2 cases
in malaria countries because of the
use of the antimalarial drug hydrox-
ychloroquine.
evidence In conclusion, Allium sativum may
be an acceptable preventive mea-
sure against COVID-19 infection
to boost immune system cells and
to repress the production and secre-
tion of proinflammatory cytokines
as well as an adipose tissue derived
hormone leptin having the proin-
flammatory nature.the centralized social distancing
is necessary to minimize the pan-
demic spreading.HCQ should be prescribed as a
part of treatment for critically ill
COVID-19 patients, with possible
outcome of saving lives.
ctw Just saw someone claiming that sip-
ping on boiled garlic water is the
magic cure for COVID-19 Any-
one else heard this one? Not buy-
ing it, folks! As a student, I’ve
got my facts straight and there’s
no scientific evidence to back this
up. Stay safe, stay informed, and
let’s not spread misinformation!
#COVID19 #StaySafe #FactCheckLet’s all do our part to keep each
other safe! By choosing to stay a lit-
tle farther apart, we can really help
slow down the spread of COVID-
19. It’s all about looking out for
one another, you know? #SocialD-
istancing #COVID19 #StaySafeDid you know that the use of hy-
droxychloroquine in malaria-prone
countries has led to a significant de-
cline in new #SARSCoV2 cases?
It’s amazing to see how this anti-
malarial drug has been repurposed
to combat COVID-19! #Hydroxy-
chloroquine #Malaria #COVID19
dpo0 There is no evidence to support the
claim that sipping on boiled garlic
water is the magic cure for COVID-
19. It is likely a form of misinforma-
tion that can be harmful if believed
and spread.By choosing to stay a little far-
ther apart, we can really help slow
down the spread of COVID-19. It’s
all about looking out for one an-
other, you know? #SocialDistanc-
ing #COVID19 #StaySafeThe claim that the use of hydroxy-
chloroquine in malaria-prone coun-
tries has led to a significant decline
in new SARS-CoV-2 cases is check-
worthy.
dpo1 Sipping on boiled garlic water is a
magic cure for COVID-19Choosing to stay a little farther
apart can help slow down the spread
of COVID-19.The use of hydroxychloroquine in
malaria-prone countries has led
to a significant decline in new
#SARSCoV2 cases
dpo2 Boiled garlic water is a magic cure
for COVID-19By choosing to stay a little farther
apart, we can really help slow down
the spread of COVID-19.The use of hydroxychloroquine in
malaria-prone countries has led
to a significant decline in new
#SARSCoV2 cases.
dpo3 Sipping on boiled garlic water is the
magic cure for COVID-19.Staying a little farther apart can
help slow down the spread of
COVID-19.“
dpo4“Choosing to stay a little farther
apart can help slow down the spread
of COVID-19.“
dpo5“Staying a little farther apart can
help slow down the spread of
COVID-19.“
dpo6“By choosing to stay a little farther
apart, we can really help slow down
the spread of COVID-19.“
dpo7“ “ “dpo8“ “ “dpo9“Staying a little farther apart can re-
ally help slow down the spread of
COVID-19.“
Table 2: Example instance with paraphrases. The ditto marker “indicates that the paraphrase is identical to the
previous iteration.
csctw 0 1 2 3 4 5 6 7 8 9
avg. # words 17.5 41.0 28.1 14.9 14.9 14.9 14.9 14.9 14.9 14.8 14.9 14.9
std 10.2 9.4 13.0 5.8 5.8 5.8 5.8 5.8 5.8 5.8 5.7 5.8
Table 3: Mean claim length in words for seed claims cs, tweets ctwand across DPO iterations.
8
Page 9:
P_0 P_1 P_2 P_3 P_4 P_5 P_6 P_7 P_8 P_9
BLEU 0.068 0 .094 0 .094 0 .094 0 .093 0 .093 0 .093 0 .093 0 .093 0 .094
METEOR 0.314 0 .325 0 .324 0 .324 0 .324 0 .325 0 .324 0 .324 0 .323 0 .325
TER 183.284 96 .230 96 .624 96 .485 96 .735 96 .846 96 .835 96 .601 96 .694 96 .557
Table 4: Average BLEU and METEOR score and average translation error rate (TER) for DPO paraphrases. P
stands for paraphrase.
9