Language plays a crucial role in day-to-day communication. The cognitive ability of humans to generate language effortlessly to express ideas, information, and facts is a remarkable phenomenon. To express any form of information effectively, a speaker or writer needs to dynamically alter the attributes of a discourse: gathering relevant facts, addressing a person or a group, sticking to ideas confined to a specific task or situation, and so on. In Natural Language Processing, language generation underpins applications such as text summarization, question answering, text completion, and sports and weather reporting.
Natural Language Generation (NLG), or text generation, is an active sub-field of NLP that aims to produce text indistinguishable from human writing. Although the rise of Recurrent Neural Networks and the Transformer architecture has brought leaps in text generation quality, with state-of-the-art outputs that come close to natural language, the generation process is still uncontrollable, randomized, and in several cases loses context. In this study, we explore controlled text generation and offer a simplified approach to the task: we train the model to learn the underlying attributes of text and to generate results confined to a dynamic, user-specified context.
Steering text generation so that it is confined to a context built upon a specified set of attributes is a challenging task. We limited our focus to three fundamental attributes of text, namely keywords, content, and style, and applied the GPT-2 transformer-based language model in a semi-supervised learning approach: each text instance is bound together with its extracted attributes using an encoding format. We trained the model on this encoding so that it learns the underlying representation and becomes capable of producing output when the attribute set is presented dynamically, as sketched below.
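To make the idea concrete, the sketch below shows one way such an encoding could bind attributes to text; the delimiter tokens and the encode_instance helper are illustrative assumptions of ours, not the exact format used in this study.

```python
# Hypothetical attribute-to-text encoding for fine-tuning a language model
# such as GPT-2. The delimiter tokens below are illustrative assumptions,
# not the study's exact specification.

def encode_instance(keywords, style, content, text):
    """Bind one text instance to its extracted attributes in a single sequence."""
    return (
        f"<|keywords|> {', '.join(keywords)} "
        f"<|style|> {style} "
        f"<|content|> {content} "
        f"<|text|> {text} <|endoftext|>"
    )

# Example training sequence: attributes first, then the target text.
sample = encode_instance(
    keywords=["striker", "goal", "stoppage time"],
    style="formal",
    content="sports",
    text="The striker scored the decisive goal deep into stoppage time.",
)
print(sample)
```

At inference time, only the attribute prefix (everything up to the text delimiter) would be supplied, and the fine-tuned model would complete the sequence with text conditioned on those attributes.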
Each control attribute was evaluated according to the way it is represented in the output text. The keyword attribute was evaluated by how well the keyword set was reflected in the final output, the style attribute by pre-existing classifiers, and the content attribute by a BiLSTM classifier that we trained on the raw dataset. Finally, the generated text was evaluated for its fluency and plausibility.
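As an illustration of the keyword evaluation, coverage can be computed as the fraction of requested keywords that appear in the generated output; the function below is a minimal sketch under that assumption (the study's exact matching rules may differ).

```python
# Minimal keyword-coverage sketch: the fraction of requested keywords
# that occur in the generated text (simple case-insensitive substring match).
def keyword_coverage(keywords, generated_text):
    text = generated_text.lower()
    hits = sum(1 for kw in keywords if kw.lower() in text)
    return hits / len(keywords) if keywords else 0.0

# Example: two of three keywords are reflected, giving a coverage of ~0.67.
print(keyword_coverage(
    ["striker", "goal", "weather"],
    "The striker scored the decisive goal deep into stoppage time.",
))
```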
The automatic evaluation shows promising results and demonstrates the model's capability to reproduce the given context with ease. The human evaluation yields results broadly similar to the automatic ones, although it remains limited by its small scale and by the lack of available resources specific to this study. During stress-testing we found that the model maintained plausibility in most cases, even when constrained by unusual inputs. The model's ability to reproduce the given context thus fulfills our initial hypothesis for this study: to investigate a semi-supervised learning approach to controlled text generation and to provide a reasonable evaluation that supports the necessary proof of concept, while acknowledging that there is still plenty of room for exploration and challenges to overcome.