
Preprints

Explore 13,737 preprints on the Authorea Preprint Repository

A preprint on Authorea can be a complete scientific manuscript submitted to a journal, an essay, a whitepaper, or a blog post. Preprints on Authorea can contain datasets, code, figures, interactive visualizations and computational notebooks.

Empowering users - self-service metabolomics data analysis for everyone?
Jianguo Xia
Backstories

Jianguo Xia

and 1 more

December 14, 2017
The initial motivation for developing MetaboAnalyst was to save time for myself. I started my PhD with Dr. David Wishart at the University of Alberta. During that period, the main focus of the lab was, of course, the Human Metabolome Database (HMDB). The development of a metabolomics core facility was also at full speed. As part of my PhD training, I was involved in a metabolomics study on urine samples from cancer cachexia patients. At that time, the only bioinformatics tool for metabolomics data analysis was a commercial software package - SIMCA-P (Umetrics). We purchased a copy of the tool, which came with a comprehensive manual. Although I could perform some “standard” data analysis to produce the numbers and graphics seen in many metabolomics publications, I soon realized its limitations - many approaches I wanted to try were not supported. I then played with Weka (https://www.cs.waikato.ac.nz/ml/weka/), a widely used Java-based machine learning tool, for classification and regression analysis. However, it lacked many features specifically needed for metabolomics data analysis. In the end, I taught myself R to perform data analysis. This worked well for a short time - I analyzed the data the way I wanted, generated impressive graphics, and produced analysis reports using Sweave & LaTeX. However, the process soon became less enjoyable as more collaborators requested that their data be analyzed in a similar fashion. A better way would be to let someone else in the lab do it. The best way would be to let researchers analyze their own data - most of them are highly educated and understand the basic principles behind most analysis methods. At that time, I was the only one in the lab who knew R and statistics - how could I enable other people with some basic knowledge to perform the same analysis I would do?

In 2008, I started thinking seriously about developing a biologist-friendly tool for metabolomics data analysis. One of the advantages of being last in the “omics” race is the benefit of hindsight. Many of the approaches developed in other omics fields are not domain-specific and can be adapted for metabolomics. For instance, the GenePattern tool suite \citep{Reich_2006} developed by the Broad Institute gave me a lot of inspiration. Other important considerations included: be web-based, respond in real time, and be implemented in languages I knew (Perl, Java and R). During a lab meeting in the summer of 2008, I proposed this idea to David. He was a bit uncertain, as he knew that I had no formal training in developing web-based applications (note: I obtained my MSc in Immunology after graduating from a 5-year Medicine program). I was very enthusiastic and said I could get it done by the end of the year. He smiled and encouraged me to pursue this direction.

As most analysis methods and graphics were already implemented in R, the key challenge was to put these functions on the web through a user-friendly interface. I wanted to use a technology that would not expire soon. The Perl CGI-based web framework was losing ground at that time. Java had a lot to offer in terms of web frameworks; however, many of them were too “heavy” for me to learn in a short time. Eventually, I chose the then relatively new JavaServer Faces (JSF) technology. The next technical challenge was how to communicate efficiently between R and Java to deal with concurrency (i.e., supporting multiple users performing data analysis at the same time).
Rserve (https://www.rforge.net/Rserve), developed by Simon Urbanek, came to my rescue. I spent around three months completing the first prototype, which captured all the steps I would take for metabolomics data analysis. The web interface was designed to be quite “conversational” and acted as a playground, allowing users to freely explore many useful statistical analysis methods once their data passed certain sanity checks, processing and normalization. MetaboAnalyst (version 1.0) was published in 2009 in Nucleic Acids Research \citep{Xia_2009}. It enabled a researcher with a basic understanding of metabolomics and statistics to perform data analysis and generate a comprehensive analysis report. It was also heavily used by other members of our metabolomics group and saved a lot of my time. My next focus was on functional analysis of metabolomics data. Using the same infrastructure, I developed tools for metabolite set enrichment analysis \citep{Xia_2010}, metabolomic pathway analysis \citep{12235}, as well as time-series data analysis \citep{Xia2011}. They were eventually merged under the umbrella of MetaboAnalyst (version 2.0) for ease of use and convenience of maintenance \citep{Xia_2012}.

While I was pursuing my PhD on bioinformatics for metabolomics, the next-generation sequencing revolution was in full swing. In 2012, I received two postdoctoral fellowships, from the Canadian Institutes of Health Research (CIHR) and the Killam Trust, to work on next-generation sequencing in Bob Hancock’s laboratory at the University of British Columbia (UBC). While at UBC, MetaboAnalyst saw a steady increase in user traffic, and I felt obligated to maintain it and keep addressing user requests. For instance, I added a biomarker analysis module to support a variety of common approaches clinicians would like to perform. With growing popularity came signs of performance issues - many colleagues experienced significantly slow responses when they used MetaboAnalyst for teaching in a large class. I eventually decided to completely re-implement the software, with a particular focus on addressing the performance bottlenecks in both the Java and R functions. I also switched to Google Compute Engine (GCE) for hosting the web application. The result was MetaboAnalyst 3.0 \citep{Xia_2015}. The impact of this update turned out to be very significant: Google Analytics showed that submitted analysis jobs jumped from 500-800 jobs/day to 5,000-8,000 jobs/day, and server downtime was also reduced significantly. We are actively developing MetaboAnalyst 4.0 at the time of writing. Its key features will be more transparent & reproducible analysis, better support for untargeted metabolomics, and integration with other omics through advanced statistics and network analysis.
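As an aside for readers curious about the Rserve pattern mentioned above, here is a minimal sketch of handing analysis work from a web tier to a pooled R process. It uses the Python client pyRserve rather than the Java client MetaboAnalyst is actually built on, and normalize_data is a hypothetical R function, not MetaboAnalyst code.

import pyRserve

def run_analysis(sample_matrix):
    # Each request opens its own connection; Rserve gives every
    # connection a dedicated R process, which is what makes concurrent
    # users safe without shared R state.
    conn = pyRserve.connect(host='localhost', port=6311)
    try:
        conn.r.data = sample_matrix                   # push data into R
        conn.eval('result <- normalize_data(data)')   # hypothetical R call
        return conn.r.result                          # pull result back
    finally:
        conn.close()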
The Science Behind Lab-Grown Meat
Elliot Swartz

Elliot Swartz

December 10, 2017
note: This material is copied from a blog post from March 15, 2017, which can also be viewed here: https://elliot-swartz.squarespace.com/science-related/invitromeat.

Introduction

The idea of meat consumption without the need for animals has been around for a long time. Winston Churchill famously mentioned the concept in his 1932 compilation Thoughts and Adventures [1], and “carniculture” was mentioned in the old science fiction novel Space Viking [2]. More recently, scientists have realized that by utilizing traditional cell culture techniques, it may be possible to grow muscle cells (i.e. meat) in vitro for consumption. This realization culminated in 2013 with the presentation and consumption of the world’s first in vitro burger, created by Mark Post and funded by Sergey Brin with a cool price tag of $330,000 (a figure that was actually somewhat misrepresented, as it included the cost of setting up the lab) [3, 4]. The event was purposefully done to raise awareness for the strategy and has since spawned 4 [known] companies pursuing the idea - Dutch-based Mosa Meats (Mark Post’s company), U.S.-based Memphis Meats [5], Israel-based Supermeat [6], and Japan-based Shojinmeat [7].
Energy Storage Textile
cy.zhi

cy.zhi

December 10, 2017
This is the backstory of: From industrially weavable and knittable highly conductive yarns to large wearable energy storage textiles \cite{Huang_2015b}.
The advantages and applications of 3D cell cultures 
Dr. Maddaly Ravi

Dr. Maddaly Ravi

December 06, 2017
The advantages and applications of 3D cell cultures

Maddaly Ravi, Professor, Department of Human Genetics, Sri Ramachandra Medical College and Research Institute, Porur, Chennai, India.

This back-story is about the paper published in the Journal of Cellular Physiology titled “3D cell culture systems: Advantages and applications”, whose citation is as follows: Maddaly Ravi, V. Paramesh, S.R. Kaviya, E. Anuradha, and F.D. Paul Solomon. 3D Cell Culture Systems: Advantages and Applications. J. Cell. Physiol. 230: 16–26, 2014.

Cell cultures are an important material of study for the variety of advantages they offer. Both established continuous cell lines and primary cell cultures continue to be invaluable for basic research and for direct applications. Technological advancements are necessary to address emerging complex challenges, and the way cells are cultured in vitro is an area of intense activity. An important advancement in cell culture techniques has been the introduction of three-dimensional culture systems, one of the fastest growing experimental approaches in the life sciences. Augmented with advancements in cell imaging and analytical systems, as well as the application of new scaffolds and matrices, cells have increasingly been grown as three-dimensional models. Such cultures have proven to be closer to in vivo natural systems, making them useful material for many applications.

Cell lines, especially cancer cell lines, have contributed immensely to understanding the complex physiology of cancers. They are excellent material for studies as they offer homogeneous samples without individual variations and can be utilised with ease and flexibility. Also, the number of assays and end-points one can study is almost limitless, with the advantage of improvising, modifying or altering several variables and methods. Literally, a new dimension in cancer research has been achieved by the advent of 3-Dimensional (3D) cell culture techniques. This approach has increased many fold the ways in which cancer cell lines can be utilised for understanding complex cancer biology. 3D cell culture techniques are now the preferred way of using cancer cell lines to bridge the gap between the ‘absolute in vitro’ and ‘true in vivo’, and we continue our work using such in vitro models. The aspects of cancer biology to which 3D cell culture systems have contributed include morphology, microenvironment, gene and protein expression, invasion/migration/metastasis, angiogenesis, tumour metabolism and drug discovery, testing chemotherapeutic agents, adaptive responses and cancer stem cells.

Our MSc in Human Genetics program includes a dissertation project component which the students pursue in the final semester of their 4-semester course. As a faculty member, I had 2 students assigned to me (the authors of the paper in discussion, Ms. S.R. Kaviya and Mr. V. Paramesh), and my role was to be their dissertation project mentor and guide. As it was my field of interest, I chose to work with 3D cultures with the two students assigned. This time again, as I have always approached (and continue to approach) student projects, one of the first steps was to make them understand the need for a thorough review of literature before working on a project. Apart from making them understand the significance of a thorough review of the available literature, I also emphasized the importance of scientific manuscript writing and the manner in which manuscripts can be well prepared. Also, one of our research scholars, Ms. E. Anuradha, was working on designing scaffolds for tissue engineering applications at that time and was taken on board for the review, along with our Head of the Department, Prof. Solomon F.D. Paul.

Thus, the back story of the publication “3D cell culture systems: Advantages and applications” began, and we set out to initially arrive at the components of the review - the aspects we wanted to cover. Once we had the review outline frozen, we started collecting the literature available from all possible sources (primary, secondary and tertiary) and analyzing/collating the details. The details were filled into the component framework we had decided for the review manuscript, and we were able to complete the “sections” of the manuscript. After several revisions and edits, the net result was a comprehensive review on the advantages and applications of 3D cultures for studies involving differentiation, drug discovery for pharmacological applications, tumor models for cancer research, gene & protein expression, cell proliferation & cell-cycle analysis, and studies involving the cytoskeleton, apoptosis, cell adhesion, cell signalling, cell motility, microenvironment, cell morphology, tissue architecture, drug response, and cell behaviour as co-cultures as the major components. Also, the various matrices and scaffolds used for 3D cultures were presented in a Table giving the classification, properties, salient features, and specific appropriate applications of the matrices and scaffolds. We were happy that the manuscript took good shape and turned out to be one that deals comprehensively with this exciting area of science.

All authors contributed equally to all aspects of manuscript preparation and submission. Ms. Kaviya’s contributions to the formatting and to post-acceptance processes such as galley proof corrections were exemplary, as was Ms. Anuradha’s contribution to the Table on scaffolds and matrices. We thank the editors, the reviewers and the editorial/production team of the Journal of Cellular Physiology for the entire process that happened from manuscript submission to acceptance and, finally, publication. We are also happy that the review continues to be among the top “most accessed” papers of the journal since it was published on-line. The review has obtained 106 citations to date (6th December 2017), in a little more than 2 years since its publication.

Now, the two PG students are well placed as research scholars, and the research scholar Ms. E. Anuradha has obtained her Doctorate and is currently pursuing her post-doctoral career. Looking back at the beginning of the story, it gives a good feeling that 2 MSc students could author a publication which was (and is) well received, and that they gained a firm footing in and understanding of the steps involved in research methodology. The three of us were able to author one more paper, this time an original research paper titled “Culture phases, cytotoxicity and protein expressions of agarose hydrogel induced Sp2/0, A549, MCF-7 cell line 3D cultures”, published in the journal Cytotechnology (Cytotechnology (2016) 68:429–441; DOI 10.1007/s10616-014-9795-z). The thorough review of literature certainly helped in obtaining a second publication with the 2 students as co-authors. The story is still on, and I continue to work with 3D cultures of cancer cell lines, hoping to contribute my minuscule bit to this area of science in whatever way I possibly can.
Lecture 15 - Diatomic molecules
Fred Jendrzejewski
Selim Jochim

Fred Jendrzejewski

and 2 more

December 04, 2017
In this lecture we will start to put atoms together to build simple molecules. We will first use the Born-Oppenheimer approximation to eliminate slow processes from the study of the fast electron dynamics. Then we will study simple mechanisms by which atoms bind.
UpGoerFive LBG Christmas Challenge: LBG Open Innovation in Science Center 
Benjamin Missbach
Patrick Lehner

Benjamin Missbach

and 1 more

December 04, 2017
A large number of people know more about the world and how stuff works than a single person, right? Obviously, we want to understand how the world works, but we think that ideas from people like you and me are very very important: we want to understand the world together! We take it seriously to make life better for people, so our job is to help those who are working on the really hard problems. Most of the time, our people sit behind closed doors, read papers, type into their computers and — by chance — miss the day-to-day problems of people like you and me. We help them to be open and change their way of finding things out about the world. Being open is important and the key word for us: it is at the heart of what we do. If you want to work like this, you should come to us; we are living in a very cool city, and you can visit with your horse, if you want — seriously!

One approach of our job is to find new ideas and questions that are important for people. There are many good ideas already out there; we just need to catch them. We are wondering what different groups of people — we call them crowds — think about problems they face in their day-to-day life. They should not be ignored, and they should be taken seriously. With this, we can find new problems that nobody ever thought of and give them to our people to work on.
Residential Permits Issuance and 311 Building Violations complaints 
Dana Chermesh
federica B bianco

Dana Chermesh

and 1 more

November 29, 2017
https://github.com/danachermesh/PUI2017_dcr346
Instructor: Dr. Federica Bianco

Abstract: This study sought to analyze the possible correlation between the number of residential permits issued and building violations, represented by 311 building-related complaints. Using several sources of data, including the Census Bureau, NYC Open Data, 311 and NYC spatial data, a descriptive analysis and regression models were conducted to better understand the two urban factors. The results were insignificant, which does not necessarily mean that a correlation between the two does not exist; rather, different or further methods might explain it better. A more meaningful negative correlation was detected between renter-occupied housing units and building violation complaints.

Introduction: New York City is a rapidly renewing urban area, with an escalating demand for housing and increasing housing costs. My motivation for this research was to analyze how the number of residential permits issued correlates with 311 complaints related to building violations, if at all, and thereby, ideally, to identify areas that should get more attention regarding building codes and building use validation. I treat permit issuance as an indicator of urban renewal, because the majority of construction in New York City requires a Department of Buildings permit to make sure that the plans are in compliance with the building code. I assumed that an area with a very low number of residential permits issued over a year (meaning an area that is developing/renewing less) would also show a relatively large number of building violations. I also guessed there are highly renewing areas with a large number of permits issued and a large number of building violation complaints. Additionally, I was interested in assessing the role of the renter/owner occupancy ratio on building violation complaints.

Data: This research relies on data from 2016 and focuses on residential information only. Any personal information was excluded. The analysis was performed at the granularity of ZIP codes, which seemed a reasonable geographical unit for observing urban renewal trends.

The study required the use of several data sources. First, permit issuance data were obtained from the DOB permit issuance open data and were cleaned to include residential permits only, then filtered again to include only New Building (NB) and massive Alteration (AL) permit types, ignoring plumbing, signs, equipment and other permit types that are insignificant to this research. The permit data were normalized by the overall number of occupied housing units, obtained from the US Census Bureau American FactFinder website using the ACS 5-year estimate data. Data for 2016 do not exist at the ZIP code level; for that reason I used data from 2013, assuming the change in the number of housing units is not meaningful. All data were grouped by ZIP code to count the number of permits issued in each ZIP code in 2016.

The Department of Buildings (DOB) violations data are divided into more than a hundred complaint categories, most of which are irrelevant to urban renewal. It was hard to define the exact categories that would best contribute to this analysis. In order to avoid misinterpretations, 311 complaints data were selected instead. The 311 data were filtered to include only building-related complaints in 2016.
The 311 complaints are also divided by complaint descriptor; the descriptors included in this analysis are:

  • Illegal Conversion Of Residential Building/Space
  • Illegal Commercial Use In Resident Zone
  • Zoning - Non-Conforming/Illegal Vehicle Storage
  • No Certificate Of Occupancy/Illegal/Contrary To CO
  • SRO - Illegal Work/No Permit/Change In Occupancy/Use
  • ROOFING
  • PORCH/BALCONY
  • SKYLIGHT
  • GUTTER/LEADER
  • FENCING

311 is a relatively new citizen-city engagement system, of which not all citizens are aware or taking advantage. To overcome this bias, the 311 data were normalized by dividing each ZIP code's number of building-related complaints by its overall number of 311 complaints. Due to the large size of the 311 dataset, the overall number of complaints in 2016 was estimated by extracting only two months from that year, January and June, as proxies for overall winter and summer complaints respectively (see IPython notebook). The data were grouped by ZIP code to count the number of building violation complaints. The weakness of the 311 data is that, even when neutralizing the bias in citizens' use of the 311 system, it is hard to account for differences in citizens' level of involvement and engagement with the city and their willingness to complain.

Additionally, for the second part of the analysis, data on the number of renter-occupied and owner-occupied housing units were obtained, also from the American FactFinder website. Finally, New York City ZIP code shapefiles were included, obtained from the NYC Open Data website.

Methodology: The first step of the analysis was to observe and describe the data for both primary variables, permit issuance and 311 building violation complaints. The distributions of the normalized variables were viewed to assess the possible similarity in their statistical behaviour.
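As a rough illustration of the normalization steps described above, the following pandas sketch uses hypothetical file and column names (zipcode, permit_type, complaint_type), not the exact fields of the DOB and 311 exports:

import pandas as pd

# Hypothetical file/column names; the real DOB and 311 exports differ.
permits = pd.read_csv('dob_permits_2016.csv')      # one row per permit
complaints = pd.read_csv('311_2016.csv')           # one row per complaint
units = pd.read_csv('acs_units_2013.csv')          # occupied units per ZIP

# Residential NB/AL permits only, counted per ZIP code and normalized
# by the number of occupied housing units (2013 ACS proxy).
res = permits[(permits['residential'] == 'YES') &
              (permits['permit_type'].isin(['NB', 'AL']))]
permits_norm = (res.groupby('zipcode').size()
                / units.set_index('zipcode')['occupied_units'])

# Building-related 311 complaints per ZIP, normalized by all 311
# complaints in that ZIP to correct for uneven adoption of 311.
building = complaints[complaints['complaint_type'] == 'Building/Use']
bv_norm = (building.groupby('zipcode').size()
           / complaints.groupby('zipcode').size())

# Two normalized variables side by side, ready for correlation/regression.
df = pd.concat({'permits': permits_norm, 'violations': bv_norm}, axis=1).dropna()
print(df.corr())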
Spatial analysis showing that deprived people are more subjected to road-traffic acci...
Cécile Nyffeler

Cécile Nyffeler

November 26, 2017
Nyffeler Cécile, Ecole Polytechnique Fédérale de Lausanne

Introduction

Road-traffic accidents represent the ninth leading cause of death worldwide (Murray and Lopez, 1997). It is thus important to gain some insight into which societal groups might be more exposed to such hazards. A clear association has been made between the probability of injury in car crashes and the poverty of the person who was hit (Aguero-Valverde and Jovanis, 2006). This finding is supported by several other studies. Siddiqui et al. (2012) were likewise able to show that lower median household incomes are associated with a higher probability of road-traffic accidents. The aim of this paper is to judge whether those findings apply at the communal level, i.e. whether people living in poorer neighborhoods are indeed more vulnerable to car crashes than those in wealthier parts of the municipality. The commune of Vernier, Switzerland, was selected on the grounds that it is a highly contrasted municipality and might thus be representative of what may happen at larger scales.

Data

Several vector layers and text files were used in this analysis. The vector layers were all provided by the open data service of the Canton of Geneva (SITG). Geographical point data containing accident locations and housing addresses were used, as well as polygon layers defining the extent of the municipal zone and the inhabited areas. The inhabited areas were characterized using a hectometric grid. The demographic data about allowances were probably taken from the Swiss Federal Office of Statistics (FSO) (not clearly mentioned in the data set).
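To make the described spatial workflow concrete, here is a minimal geopandas sketch of joining accident points to a hectometric grid; the file names and layer contents are assumptions, not the actual SITG schema:

import geopandas as gpd

# Hypothetical file names; the real SITG layers are named differently.
accidents = gpd.read_file('accidents.shp')   # points: accident locations
grid = gpd.read_file('hecto_grid.shp')       # polygons: 100 m grid cells

# Assign each accident to the grid cell containing it.
joined = gpd.sjoin(accidents, grid, predicate='within')

# Count accidents per cell and attach the count back to the grid, so it
# can be compared against a deprivation indicator such as allowance rates.
counts = joined.groupby('index_right').size().rename('n_accidents')
grid = grid.join(counts).fillna({'n_accidents': 0})
print(grid['n_accidents'].describe())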
Scholarly Publishing, Freedom of Information and Academic Self-Determination: The UNA...
Ernesto Priego
Erin McKiernan

Ernesto Priego

and 15 more

November 24, 2017
On February 1, 2015, the global information and analytics corporation Elsevier and the National Autonomous University of Mexico (UNAM) established agreement UNAM-Elsevier contract DGAJ-DPI-39-081114-241, which transferred from UNAM to Elsevier the "production and hosting, advertising and support" of 44 Mexican open access academic journals published by UNAM.
PREreview: Explanation implies causation? (Myint, Leek, & Jager, 2017)
C.H.J. Hartgerink
Laura Kunst

C.H.J. Hartgerink

and 2 more

November 24, 2017
A preprint review of "Explanation implies causation?" by Myint, Leek, & Jager. Posted on bioRxiv, November 13 2017. doi: 10.1101/218784
A new definition of statistical significance
Thomas F Heston

Thomas F Heston

November 23, 2017
The classical definition of statistical significance is p <= 0.05, meaning there is a 1/20 chance that the test statistic found is due to normal variation under the null hypothesis. This definition of statistical significance does not represent the likelihood that the alternative hypothesis is true. Hypothesis testing can be evaluated using a 2x2 table (reconstructed below from the values given in the text):

                         Alternative true   Null true
  p <= 0.05 (positive)     a = 0.80           b = 0.05
  p >  0.05 (negative)     c = 0.20           d = 0.95

Box "a" = true positives: p <= 0.05 and the alternative hypothesis is true. This is the study's power. A rule of thumb is that study power should be at least 80% (80% of the time the statistical test is positive when the alternative hypothesis is true); therefore a = 0.80. Box "b" = false positives: p <= 0.05 but the alternative hypothesis is false. By definition, when p = 0.05 the test statistic has a 5% probability of occurring by chance when the null hypothesis is true; therefore b = 0.05. Box "c" = false negatives: p > 0.05 but the alternative hypothesis is true. This occurs 20% of the time when the study's power is 80%; therefore c = 0.20. Box "d" = true negatives: p > 0.05 and the null hypothesis is true. This occurs 95% of the time when the significance cut-off is p <= 0.05; therefore d = 0.95. From this table we derive: sensitivity = power = a/(a+c) = 80%; specificity = (1-p) = d/(b+d) = 95%; positive predictive value = power/(power + p-value) = a/(a+b) = 94%; negative predictive value = d/(c+d) = 83%. The classical definition of statistical significance is based on (1 - specificity) and does not take power into consideration. The proposed new definition of statistical significance is when the positive predictive value of a test statistic is 95% or greater. To arrive at this, the cut-off p-value representing statistical significance needs to be corrected for study power so that 0.05 >= (p-value)/(p-value + power). From this it can be derived that, to achieve 95% predictive confidence, statistical significance requires a p-value <= power / 19.
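As a numeric check of the derivation above (my own illustration, not part of the original note), this short Python snippet computes the power-adjusted cut-off:

# PPV = true positives / all positives, treating the p-value cut-off
# as the false-positive rate, as in the 2x2 table above.
def positive_predictive_value(power, p_value):
    return power / (power + p_value)

# Solve power/(power + p) >= target for p: p <= power*(1-target)/target,
# which is power/19 when the target PPV is 95%.
def significance_threshold(power, target_ppv=0.95):
    return power * (1 - target_ppv) / target_ppv

for power in (0.80, 0.50, 0.95):
    p_max = significance_threshold(power)
    ppv = positive_predictive_value(power, p_max)
    print(f"power={power:.2f}: significant if p <= {p_max:.4f} (PPV={ppv:.2f})")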
How often are statistically significant results clinically relevant? Not often
Thomas F Heston

Thomas F Heston

November 23, 2017
Objectives: Statistical significance does not equal clinical significance. This study looked at how frequently statistically significant results in the nuclear medicine literature are clinically relevant. Methods: A Medline search was performed with results limited to clinical trials or randomized controlled trials published in one of the major nuclear medicine journals. Articles analyzed were limited to those reporting continuous variables where a mean (X) and standard deviation (SD) were reported and determined to be statistically significant (p < 0.05). A total of 32 test results were evaluated. Clinical relevance was determined in a two-step fashion. First, the crossover point between group 1 (normal) and group 2 (abnormal) was determined. This is the point at which a variable is just as likely to fall in the normal distribution as the abnormal distribution. Jacobson's test for clinically significant change was used: crossover point = (SD1 * X2 + SD2 * X1) / (SD1 + SD2). It was then determined how many SDs from the mean this crossover point fell. For example, 13.9 +/- 4.5 compared to 9.2 +/- 2.1 was reported as statistically significant (p < 0.05). The crossover point is 10.7, which is 0.71 SD from each mean: 13.9 - (0.71*4.5) = 9.2 + (0.71*2.1). Results: The average crossover point was 0.66 SD from the mean. The crossover point was within 1 SD of the mean in 26/32 cases, and in these cases averaged 0.45 SD. Thus, for 4 out of 5 statistically significant results, when applied to an individual patient, the cut-off between normal and abnormal was 0.45 SD from the mean. This results in a third of normal patients falling into the abnormal category. Conclusions: Statistically significant results frequently are not clinically significant. Statistical significance alone does not ensure clinical relevance.
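The crossover computation quoted above is easy to reproduce; this small Python snippet reruns the worked example from the abstract:

# Crossover point between two groups: the value equally likely under
# the normal and abnormal distributions (Jacobson's formula above).
def crossover_point(mean1, sd1, mean2, sd2):
    return (sd1 * mean2 + sd2 * mean1) / (sd1 + sd2)

# Worked example from the abstract: 13.9 +/- 4.5 vs 9.2 +/- 2.1.
x = crossover_point(13.9, 4.5, 9.2, 2.1)
print(f"crossover = {x:.1f}")                             # ~10.7
print(f"SDs from group 1 mean = {(13.9 - x) / 4.5:.2f}")  # ~0.71
print(f"SDs from group 2 mean = {(x - 9.2) / 2.1:.2f}")   # ~0.71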
Energy Benchmarking of Washington DC Apartment Buildings
Nicholas Jones

Nicholas Jones

November 22, 2017
As my extra credit project, I propose to examine what factors drive energy use in large apartment buildings in Washington DC. Washington is one of a small number of cities that have energy disclosure laws. While the dataset was introduced later than New York's Local Law 84, it provides rich data with which to analyze and benchmark energy use.

Motivation: Washington DC has a significant number of large apartment complexes with more than 100 residential units. However, these differ in age and design characteristics: some date from 1900-1945; many others were built in 1945-1975; and a new generation of large, expensive apartment blocks is going up. Should the city government focus on the older residential buildings - which may have poor build quality and lack good insulation - or should it focus on the fancy new buildings, which are ostensibly high-tech and energy efficient, but also have swimming pools and large floor layouts? The project should provide an answer by prioritizing which set of buildings to focus on based on their relative energy consumption.

Data availability: The Washington DC Department of Energy and Environment (DOEE) publishes annual data including EUI per square foot and weather-normalized EUI. City planning data makes available floorspace, building age, and the amount of the lot covered by the building.

Approach: The project would merge and clean the relevant datasets. A linear regression model would be constructed with building EUI as the dependent variable. During the exploratory phase, additional data columns may be discovered and incorporated into the model. The validity of the regression model would be tested against a training set of Washington DC buildings, and the model would be optimized through feature selection, as well as checking and correcting for multi-collinearity.

Expected outputs: A simple predictive model for apartment building energy consumption in Washington DC, together with its results. Identification of the top large apartment blocks whose EUI differs from the predicted value, visualized on a map. Identification of features associated with high energy use (such as the energy use tendencies of 1950s, 1970s and 2000s buildings).
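A minimal sketch of the proposed regression, assuming hypothetical column names (eui, floorspace, year_built, lot_coverage) rather than the actual DOEE schema:

import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Hypothetical merged dataset of DOEE benchmarking + city planning data.
df = pd.read_csv('dc_apartments.csv')

features = ['floorspace', 'year_built', 'lot_coverage']
X, y = df[features], df['eui']   # weather-normalized EUI as the target

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LinearRegression().fit(X_train, y_train)
print('R^2 on held-out buildings:', model.score(X_test, y_test))

# Buildings whose actual EUI most exceeds the prediction are candidates
# for the city to prioritize.
df['residual'] = y - model.predict(X)
print(df.nlargest(10, 'residual')[['eui', 'residual']])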
Establishing a Regional Blockchain Innovation Cluster in Health Care
Thomas F Heston

Thomas F Heston

November 21, 2017
Blockchain technology has great potential to revolutionize healthcare data management. The technology is sufficiently complex, however, that a large number of people with a broad range of skills will be required to implement it. Innovation clusters will be the primary means of producing blockchain breakthroughs in healthcare by bringing together computer scientists, medical experts, and business people in pursuit of a common goal. Healthcare innovation clusters are most likely to be centered around medical universities, but will also include, in close geographic proximity, technology, business, and medical insurance organizations.
A Theory of Everything.
Hontas Farmer

Hontas Farmer

November 18, 2017
Space is all that there is, just not the space as we live in it every day. The true space has position but not as we know it. Yet from it we can find all the forces that make our world.
Making new ways of understanding very small building-blocks of life using picture-tak...
Martin Jones

Martin Jones

November 17, 2017
We use picture-taking-boxes with a very large make-bigger-power to take pictures of the very small building blocks that make up humans and other animals. We can also see the tiny things that do important stuff inside these building blocks. These tiny things are interesting because we need to understand them to see why people get sick and how to make them better. If there isn't a box that does what we need, we build a new one that does! Sometimes we even make boxes that go inside other boxes so we can take different kinds of pictures all at once to tell us different things.The boxes make lots of pictures so we need good ways to understand them quickly with computers. To do this we need to carefully give the computer lots of orders and numbers to control what they do so they do good work. At the moment, even with the best computers this is too hard and takes too long, so we need to think of better ways to train the computer to do good work more quickly. Often we have to ask a person to do the work as they still do it best, but slowly.We have started to ask tens of hundreds of people around the world to help us as they can do good work together more quickly than one person. We can also get them to help us train the computers to do better work in up-coming times. The people that help us are aged from school children all the way up to older people who don't have to go to work any more, and everything in between. 
Open hardware for random access plate storage
Theo Sanderson

Theo Sanderson

November 17, 2017
We describe a design that uses 3D printed parts and off-the-shelf components to create an inexpensive system for automated storage and retrieval of microplates. In this way, a robotic system for storage of 24 plates can be built for less than $400.
Make sick people better again
Hinrich W.H. Göhlmann

Hinrich W.H. Göhlmann

November 15, 2017
My work is about two types of studies. In one study we look at cells from a normal person and a sick person. We find out what is different between them. In the other study we look at many possible ways to make a person better again. You can not buy most of these in a shop, as they have not been studied in all possible ways to be sure that they really do what they should. In the next step, we use a computer to match what we know about sick people with the best possible way to make them better again. This best possible match becomes the starting point for coming up with a new way to get sick people better again. It does not happen very often at all, but sometimes we can even find something that you can already buy. In those cases we can help people very quickly \cite{Lamb_2006}.
Cancer: beautiful body
Ammar Husami

Ammar Husami

November 14, 2017
A document by Ammar Husami, written on Authorea.
Folding, Funding and Phyre - a tool-building quest to solve one of the biggest proble...
Lawrence Kelley

Lawrence Kelley

November 14, 2017
History

I decided to be a scientist when I was 7 years old, after watching Carl Sagan on TV talk about neutron stars. By 16 I had discovered the protein folding problem: a disarmingly simply stated problem whose solution lay at the heart of everything in biology. How hard could it be?

So, a degree in Biochemistry later, I did a PhD in the lab of a computational molecular modeller (Mike Sutcliffe) and got started on the problem of clustering protein structures from NMR. I wrote my first program in C and built my first web site with my characteristically stupid pun in the name - OLDERADO: On-Line Database of Ensemble Representatives and Domains [1] - to look at the results. Each morning I would go on the internet as it then was to check the dozen or so new sites that had sprung up the night before and had been manually added to a list known as the ‘internet directory’ by someone ‘out there’ on the new web. Google appeared about a month later. Websites were going to be a big deal. Being an early adopter of web-based science seemed like a good bet.

Then came a move to London to be with Mike Sternberg, a world expert on protein structure prediction. I inherited an early protein fold recognition program (Foldfit [2]), written by Rob Russell, Mansoor Saqi, Paul Bates and Roger Sayle (of Rasmol fame), and had to improve it. The result was 3D-PSSM [3], and I made a ‘fancy’ website (read: 1990s black background with gold, animated logo) for other people to use the program and for me to figure out where the bugs were. It worked great - we won the automated section of the international CASP competition in structure prediction, and hundreds of people started using the site. It was one of the first of its kind, which made it easier to stand out. Throughout, I was mainly making the site to help me understand how my software worked in various situations. With so many different types of data to look at, I needed a way to sensibly display this info just to understand what was going on internally. This remains a critical focus.

New techniques arose in the field and I sat on my laurels with 3D-PSSM. I neglected the site and realised only later how valuable a large internet audience is and how I had squandered it at its peak. So I vowed to try again and rebuild it from the ground up, this time properly. Using newfound skills in web development I created Phyre [4]. This did well in both scientific recognition and number of users, so I was very pleased. But soon funding looked iffy and was about to run out.

Funding

I wondered if any users could help. So I asked on the Phyre page if anyone could write a letter of support that we could include with our application for funding. We received over 1,000 letters of support, which was amazing and had a huge effect on our application. But I think it placed the funding bodies in a difficult position. Making web sites that help scientists use state-of-the-art tools is undoubtedly a good thing. These sites are continuously working, 24 hrs/day, reliably doing some work to help science. They seem like safe and sensible places to direct funding - IF they are widely used. IF they are widely used, the funding body would look deliberately negligent if they failed to fund them. But at the same time they are justified in thinking this is outside their remit. They decide on science funding, not tools. So new arms of the funding bodies started to form, dedicated to tools and resources. This is where we are today.
But deciding how to handle funding for tools is a new challenge we are all still grappling with. The quest for funding is ever-present. It is critical that tools are maintained. How do you justify the maintenance cost? It is easy to fall through the funding cracks - it’s neither proper science, nor pure infrastructure. I get very paranoid about funding living in this limbo. But if user numbers and citations continue to increase, we are safe. These are the only ways to prove your value in this game. But this hopefully reflects the real-world utility of the tools you create. So what’s the best way to maximise the utility of these tools to researchers?

Build it, and they will come

Great tools based on state-of-the-art algorithms often go unused and unnoticed - great science gets ‘hidden’ in the hands of a few people competing intensely over small absolute changes in accuracy, using programs designed for no one but the developers. The gains to society and science from enabling more people to use these tools are far greater. But then they need to be far easier to use if more people are to use them. Hence my focus on user interfaces. The best way to learn what works for user interfaces is by using lots of them. And thinking about the types of question a user may have. Simply BE a user, and every time you want to do something and CAN’T, write a program to do it. Accumulate these programs in a web site. This empowers me, and anyone else who cares, with these new tools - a virtuous circle.

‘BE the user’ sounds great. But it gets harder as you spend more time on computer development and less time working on a biological problem with real proteins. I don’t mean wet lab; just the computational analysis problems with specific proteins faced by real scientists. That’s the hardest part to remain connected to. For now I just imagine scenarios. I imagine all the combinations we can make by connecting our tools in different ways, and build the ones that seem most promising. I meet users at workshops to find out what they want. But this is where the next development needs to happen. How can I communicate effectively with 50,000 users about what they want and what I can deliver? This is the biggest challenge for me right now.

Despite the funding complexities, we got funding for Phyre2 [5]. THIS time I can do it properly, I thought: use the most up-to-date facilities in the browser to make it look nicer and easier to use, and add those new ideas I had or that were suggested to me by users at workshops (PhyreAlarm, BackPhyre, One-2-one threading). I also tried to improve how a user could look at both sequence and structure at the same time in the browser to analyse a range of features. This led to the development of Phyre Investigator (one module in Phyre2) and honed my skills at javascript for the future. This most recent paper has again done well in the citation game, which I hope reflects its usefulness to researchers ‘out there’ on the web.

Folding

It’s important to remember the reason I’ve done any of this, and that reason is the mystery of protein folding. The person who solves the protein folding problem (at least I hope it’s a person) won’t do it with pen and paper. It will be someone extremely adept at using a variety of tools that probably already exist individually, and who will put them together in the right way. That is generic problem solving. Make this easier for people, and problems in general get solved faster.
With folding, my main fear is that this will be one of the first major science problems where artificial intelligence gets there before us. Google DeepMind is after it, and we’ve seen what they’ve done to the game of Go.

Protein folding is the molecular biology equivalent of the Goldbach conjecture: the impossible task that no one would sensibly pay you to work on, but that you want to work on because of its elegant simplicity and deep importance. So you need to do something useful while you ponder the impossible. You find the most useful thing you can do for everyone else and for you. Keep it as close as you can to folding whilst maintaining an audience size that justifies your existence - i.e. homology modelling. Problems that no one has been able to solve for decades typically don’t fall over from expected directions of attack. To be a long-standing problem means most lines of attack that occur to people have been tried. So new lines must be found by looking further afield, in other areas of science where an analogous problem may have been thwarted. But to see the mapping between a problem in, say, physics or economics or mathematics, and the folding problem requires a decent understanding of both areas. So that means trying to learn physics, signal processing, quantum mechanics, maths, computer science, AI, etc. All in the hope of seeing a new angle from which to attack the problem. And it’s fun and fascinating. But you need to make yourself useful to the world and justify your salary. My most recent direct stab at folding involved eigen decomposition of protein contact maps. As you may surmise, I have not yet been contacted by Stockholm.

Where to publish?

Well, it’s not really science, is it? It’s tools to do science. So it doesn’t sit easily with most journals. You can submit an ‘Application Note’, which is about a page describing your web server. This doesn’t really count as a proper scientific paper in many eyes. Or often authors will use their tool to do some rapidly thrown-together analysis that makes what would otherwise be an application note pass for a normal scientific article. Neither of these scenarios is ideal. The way I see it, when I’ve made a tool I haven’t answered a biological question, but I’ve made it easier to answer many biological questions. In the end we went for Nature Protocols, which is aimed at step-by-step instruction in how to use previously published tools. It fits, albeit somewhat uncomfortably.

Future

So Phyre2 has done very well in attracting users, and now I’m on to Phyre3, which will be out shortly (end of 2017). Again it’s a full redesign, this time with two people on it - me on the front end and my colleague Stefans Mezulis on the back end - using the browser to its best, new web tech, polished with a new engine (the PhyreEngine, naturally) and more hardware for faster processing of bigger genomes. Also an entirely new tool called PhyreRisk is being developed: a portal for analysing disease, mutations, structures and complexes. But the primary focus is still the same: utility. What do people want or need to do, and what can be done computationally to help them? The better we can match what can be done computationally with what researchers want to do, the faster we will make progress in all of science. And maybe we’ll beat DeepMind to the answer to protein folding.

References

1. OLDERADO: On-line database of ensemble representatives and domains. LA Kelley, MJ Sutcliffe. Protein Science 6 (12), 2628-2630 (1997)
2. Recognition of analogous and homologous protein folds - assessment of prediction success and associated alignment accuracy using empirical substitution matrices. RB Russell, MA Saqi, PA Bates, RA Sayle, MJ Sternberg. Protein Eng. 11 (1), 1-9 (1998)
3. Enhanced genome annotation using structural profiles in the program 3D-PSSM. LA Kelley, RM MacCallum, MJE Sternberg. Journal of Molecular Biology 299 (2), 501-522 (2000)
4. Protein structure prediction on the Web: a case study using the Phyre server. LA Kelley, MJE Sternberg. Nature Protocols 4 (3), 363-371 (2009)
5. The Phyre2 web portal for protein modeling, prediction and analysis. LA Kelley, S Mezulis, CM Yates, MN Wass, MJE Sternberg. Nature Protocols 10 (6), 845-858 (2015)
Sauna Bathing and the Cardiovascular System
Thomas F Heston

Thomas F Heston

November 14, 2017
Thomas F Heston, MD, FAAFP
Department of Medical Education and Clinical Sciences, Elson S Floyd College of Medicine, Washington State University, Spokane, Washington, USA

Citation: Heston TF. Sauna bathing and the cardiovascular system. Int J Sci Res (Ahmedabad). 2017 Nov; 6(11): 569-570.
Title
Dana Chermesh
Federica B. Bianco

Dana Chermesh

and 1 more

November 14, 2017
PUI2017 Extra Credit Project Proposal
by: Dana Chermesh (dcr346)
https://github.com/danachermesh/PUI2017_dcr346
Instructor: Dr. Federica Bianco

Problem Description: My motivation for this assignment is to analyze how the number of residential permits issued correlates with the number of housing violations, if at all, and thereby to identify areas that should get more attention regarding building codes and building use validation. I treat permit issuance as an indicator of urban renewal, because the majority of construction in New York City requires a Department of Buildings permit to make sure that the plans are in compliance with the Building Code (from nyc.gov). I assume that an area with a very low number of residential permits issued over a year (meaning an area that is developing/renewing less) will also have a relatively large number of building violations (illegal conversions). I also guess there are highly renewing areas with a large number of permits issued and a large number of building violation complaints. Additionally, I am interested in assessing the role of income in predicting housing violations.
A Case Study in Blockchain Healthcare Innovation
Thomas F Heston

Thomas F Heston

November 13, 2017
Healthcare complexity and costs can be decreased through the application of blockchain technology to medical records and insurance companies. Estonia has taken a leadership role in blockchain-based services, both in the commercial sector and in government. The Estonian government’s innovation strategy was to create GovTech partnerships to implement blockchain-based technologies throughout the country and become a global leader in the technology. Starting in 2011, just 3 years after Satoshi Nakamoto published the first description of distributed ledgers and blockchain technology, the Estonian government started partnering with the private technology startup company Guardtime to use blockchains to secure public and internal records. Then in 2016, Estonia once again reinforced its global leadership in blockchain technology when it announced it would use blockchain technology to secure the health records of over a million citizens. Estonia’s systematic method of applying blockchain technologies through GovTech partnerships demonstrates how innovation is a process. Estonia also identified early the value of the blockchain as a disruptive platform innovation. The application of blockchain technology to healthcare is a radical innovation, given that nearly all previous applications have been in the financial and legal sectors.