Through my varied education and activities, I have acquired many skills that are useful in computer science.
I have been president of a 100-member student association for two years, which has developed my skills in teamwork, report writing, and communicating with ease. I also completed a first year of studies in English literature and civilization, which gave me a good level of English and broadened my knowledge of linguistics.
My work as a web writer allowed me to cover well-known events such as the Utopiales in Nantes.
I use these diverse skills every day in my computer science studies and work.
Survey and study state-of-the-art algorithms for extracting the main content from HTML pages
Extract the main content from HTML pages with a domain-based approach (learn on one web domain, apply to similar domains)
Detailed Description
The goal is to create an algorithm that can find useful information in web pages. The target content is the user reviews that may be present on a page (the user's pseudonym, the publication date, and the review text). I work with three other students, and I focus on locating the main content so that a tuned L* (L-star) algorithm needs to ask its human oracle fewer questions.
Company Description
Dictanova is a French startup working on the qualitative analysis of web content.