Commit 4fc47f84 authored by Lucas Schons

Add test for clean_html.R

parent 83529e59
2 merge requests: !34 Resolve "Add unit tests for clean_html.R", !27 Resolve "Add unit tests for cleanHtml.R"
@@ -8,7 +8,7 @@ library(rprojroot)
 ## Set up nlp
-init_nlp("conda", "spcy")
+# init_nlp("conda", "spcy")
 ## Fetch data
@@ -30,11 +30,11 @@ results <- pbapply(articles, 1, function(article) {
 ## Data cleaning
-cleaned.text <- wikiproc::cleanHtml(article[4])
+cleaned.text <- wikiproc::clean_html(article[4])
 ## Data preprocessing/annotating
-annotation <- create_annotations(cleaned.text, article[2], article[3], data.dir = data_dir)
+# annotation <- create_annotations(cleaned.text, article[2], article[3], data.dir = data_dir)
 ## Extract information from Text
Academician Hasan Abdullayev (also spelled as Gasan Mamed Bagir ogly Abdullaev; Azerbaijani: Həsən Məmmədbağır oğlu Abdullayev ; Russian: Гасан Мамед Багир оглы Абдуллаев ; August 20, 1918 – September 1, 1993) was a leading top Soviet and Azerbaijani physicist, scientist and public official, President of the National Academy of Sciences of the Azerbaijan SSR. He was a Doctor of Sciences in physics and mathematics, Professor of physics and mathematics, Director of the Institute of Mathematics and Physics of the National Academy of Sciences of the Azerbaijan SSR, full Academician of the National Academy of Sciences of the Azerbaijan SSR, corresponding member of the Soviet Academy of Sciences and Russian Academy of Sciences, and in 1970-1983 was the longest-serving President of the National Academy of Sciences of the Azerbaijan SSR. He was also an elected member of the Azerbaijan SSR Parliament, and the elected member of the 8th, 9th and 10th convocations of the Supreme Soviet of the Soviet Union. Academician Abdullayev was one of the founders of the Soviet semiconductors physics and a leading scientist in new technologies. He made an outstanding contribution to the development of electronics, astrophysics, aeronautics, medicine, biophysics and defense industries. Academician Abdullayev was the author of 585 Soviet and foreign patents, including 171 secret and 65 top secret patents, author of 28 scientific books (monographs), over 800 journal and encyclopedia articles in English, Russian and Azerbaijani languages.
Hasan Abdullayev was born on August 20, 1918, in Yaycı, Nakhchivan during the time of the Azerbaijan Democratic Republic. He died on September 1, 1993, in Baku, and was buried at the Alley of Honor.
Hasan Abdullayev's name was memorialized by naming the Institute of Physics of the Azerbaijan Academy of Sciences, which he led and expanded into a world-class scientific research institute in 1957-1993, after him, as well as naming a street in downtown Baku, installing a plaque on the apartment complex he lived in, and naming a primary school in Nakhchivan. Additionally, several scholarships named after him have been awarded to undergraduate, graduate and post-graduate science students in Azerbaijan from 2003. Every five years conferences dedicated to his scientific heritage have been held in Baku, such as in 2013, 2007, and 2003. Spoke native Azerbaijani, was fluent in Russian and German, as well as English. Married, with three children, and six grandchildren.
Academician Abdullayev dedicated over fifty years of his life to the physics of semiconductors. Discovered new groups of binary and ternary compounds of selenium and tellurium, suggested diodes with controlled electronic memory, created complex semiconductors used as receivers for visible and infrared spectrum areas. By researching the physics of selenium and selenium appliances, was the first to explain the abnormalities in selenium and invented an approach to control them. Carried out a set of research projects to receive semiconductor monocrystals of complex chemical composition for lasers and memory modules. Elaborated new semiconductor materials for heat converters.
In 1954, Hasan Abdullayev founded the Department of Semiconductor Physics at the Baku State University (BSU). Abdullayev founded the Nakhchivan and Gyandja branches of the Azerbaijan SSR Academy of Sciences and established more than 50 scientific production and construction bureaus, which were tasked with the application of scientific theories and discoveries, and their more rapid introduction into production and life, in the republic.
According to a 2010 article published in the Russian scientific journal Physics and technique of semiconductors of the Joffe Institute, dedicated to the 60th anniversary of semiconductor electronics research in the USSR, one of the important roles in Soviet semiconductor electronics research, development and innovation was done by academician Abdullayev.
Academician Abdullayev's lifelong research and work concentrated on chemical elements selenium and tellurium, their applications in semiconductors, biophysics and nuclear sciences.
Zhores Alferov, the Nobel-prize winning physicist, praised the work and legacy of his late colleague and friend, academician Abdullayev, recognizing how hard it was for Azerbaijani scientists to rise even within USSR, much less in the world, and only a few people as Abdullayev managed to do it, creating new industries and directions in physics and other sciences.
At the initiative and under the direct leadership of academician Hasan Abdullayev the following research and scientific institutions and initiatives were established: Academician Hasan Abdullayev was honored with the top Soviet award - the Order of Lenin in 1978, the Order of the Red Banner of Labour, the Vavilov Gold Medal of the Federation of Cosmonautics Siolkovsky Gold Medal of the Federation of Cosmonautics, was laureate of Azerbaijan SSR State Award in 1972, was an Honored Scientist of Azerbaijan SSR, and with other medals and prestigious Soviet and international scientific awards.
Academician Hasan Abdullayev is the author of 28 monographs, several scientific textbooks, approximately six hundred scientific journal articles. He holds 585 patents from USSR (including 171 secret and 65 top secret patents for technologies with military applications), and 35 foreign patents from France, Germany, Great Britain, Japan, Sweden, Italy, Bulgaria, India, and U.S. (United States Patent 3,472,652).
Academician Abdullayev received highest praise from his colleagues, including Nobel Prize winner academician Zhores Alferov, Nobel Prize winner academician Alexander Prokhorov, Kurchatov Institute President and Director Evgeny Velikhov, academician Bentsion Vul, academician Vladimir Tuchkevich, academician Sergey Kapitsa, academician Roald Sagdeev, Nobel Prize winner professor Rudolf Ludwig Mossbauer, academician Nikolay Bogolyubov, Soviet Academy of Sciences Presidents academician Alexander Nesmeyanov, academician Anatoly Petrovich Alexandrov, academician Mstislav Keldysh and other Soviet and foreign scientists.
According to a 2008 article, "Academician Abdullayev was called the Father of Physics in Azerbaijan and one of the Founders of the School of Semiconductor Research in the Soviet Union by such authoritative scientists as academicians Zh.Alferov, Yu.Gulyaev, L.Kurbatov, V.Isakov, Professor D.Nasledov, and others. In fact, the Great Soviet Encyclopedia, the most authoritative Soviet encyclopedia - the Soviet equivalent of the Encyclopædia Britannica in the West, listed the names of scientists, making the greatest contributions to the development of semiconductor electronics and microelectronics in this order: A.F.Ioffe (who was Abdullayev's mentor during his postdoctoral studies in Leningrad), N.P.Sazhin, Ya.I.Frenkel, B.M.Vul, V.M.Tuchkevich, H.B.Abdullayev, Zh.I.Alferov, L.V.Keldish, and others (Third Edition, 1970, page 351). Thus, already in 1970, this encyclopedia put academician Abdullayev as the sixth most influential scientist in semi-conductor research, higher than such giants as Academicians Alferov and Keldish!"
Academician Abdullayev was recognized as the top expert on the chemical element selenium, and thus entrusted authoring the article on selenium in the third (final) edition of the top scientific reference publication - the Great Soviet Encyclopedia. Original quote in Russian: "Модель с использованием структуры с p−n-переходом для объяснения выпрямления в селеновых выпрямителях предлагалась Д.Н. Наследовым и Г.Б. Абдуллаевым. Несмотря на многочисленные исследования, теория функционирования полупроводниковых выпрямителей на основе закиси меди и селена в течение многих лет не была создана."
Original quote in Russian: "Начиная с 1960-года, и примерно до 1987 года в Баку я был много раз. Затем приезжал сюда в 2003 году, принять участие в праздновании 85 лет со дня рождения моего друга, покойного президента Азербайджанской академии наук Гасана Багировича Абдуллаева. Тогда же я побывал в Институте физики Академии наук Азербайджана. Обрадовался, что он сохранился.... Но дело в том, что и в советское время азербайджанцам было нелегко иметь достаточно прочные позиции, не то, чтобы в мировой, но и в советской науке. Г. Абдуллаев был очень талантливым физиком. Он понимал, что физика полупроводников - широкая область. Для развития промышленности нужно развивать многое. Но в целом Институт должен иметь свое лицо. И он его создал - это слоистые полупроводники на основе селена, которые нашли массу применений в опцеэлектронике, в оптике. И это очень хорошо. Люди на этом росли и развивались. Появился целый ряд отраслевых организаций. Я не могу сказать как обстоят дела с физикой в Азербайджане сегодня, но думаю, что они далеки от благополучия."
Original quote from the Great Soviet Encyclopedia in Russian: "Большой вклад в создание Полупроводниковой электроники внесли советские учёные — физики и инженеры (А. Ф. Иоффе, Н. П. Сажин, Я. И. Френкель, Б. М. Вул, В. М. Тучкевич, Г. Б. Абдулаев, Ж. И. Алферов, К. А. Валиев, Ю. П. Докучаев, Л. В. Келдыш, С. Г. Калашников, В. Г. Колесников, А. В. Красилов, В. Е, Лашкарёв, Я. А. Федотов и многие др.)." А. И. Шокин. Полупроводниковая электроника. Большая советская энциклопедия. — М.: Советская энциклопедия 1969—1978.
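context("test-clean_html")

test_that("html cleansing works", {
  # Fixture files are resolved relative to the test directory.
  filename_raw <- "article-4-raw.html"
  filename_cleansed <- "article-4-cleansed.txt"
  # Read each file in full as a single string.
  html <- readChar(filename_raw, file.info(filename_raw)$size)
  expected <- readChar(filename_cleansed, file.info(filename_cleansed)$size)
  actual <- clean_html(html)
  # testthat expects the tested value first, then the reference value.
  expect_equal(actual, expected)
})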
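The test reads each fixture into a single string with `readChar()`, sized by the file's byte count. Because the fixture contains Azerbaijani and Russian text, a reader that counts bytes explicitly and marks the result as UTF-8 may be more robust across locales. The following is a minimal sketch of that idea, not part of this commit; `read_fixture` is a hypothetical helper name, and the temporary file stands in for the real fixture files.

```r
# Hypothetical helper (not part of the commit): read a whole file into one string.
read_fixture <- function(path) {
  # file.size() reports bytes; useBytes = TRUE makes readChar() count bytes too,
  # so multibyte UTF-8 text is read to the end of the file.
  text <- readChar(path, file.size(path), useBytes = TRUE)
  # Mark the bytes as UTF-8 so later comparisons treat them as text.
  Encoding(text) <- "UTF-8"
  text
}

# Usage with a throwaway file, so the sketch runs without the real fixtures.
sample_text <- enc2utf8("H\u0259s\u0259n Abdullayev")
tmp <- tempfile(fileext = ".txt")
writeLines(sample_text, tmp, sep = "", useBytes = TRUE)
stopifnot(identical(read_fixture(tmp), sample_text))
```

In the test itself, such a helper would replace the two `readChar()` calls; whether that is worthwhile depends on the locales the test suite is expected to run under.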
context("test-cleanhtml")

test_that("multiplication works", {
  expect_equal(2 * 2, 4)
})