Authors: Avinash Kuchipudi | Harendra Prasad B | Raghu ram Ambati | Ashok Reddy | Vivek Velagala
(Group 6)
Synopsis
Experiments conducted on the internet tend to have a large number of observations. Analytics-driven businesses such as websites and e-commerce need these experiments to identify parameters that can help the business grow. The natural experiment is one such method by which several unknown parameters can be identified. The article explains a few techniques to help researchers navigate the online field experiment arena.
Why field experiments?
To evaluate new ideas, guide product development, and improve interface design.
In a lab experiment, the experimenter has a high degree of control over exogenous variables, so the results may not be representative of actual behavior. To validate lab findings, researchers therefore often resort to field experiments. Physical field experiments, however, can be prohibitively costly because of the time and effort they require, among other limitations. To overcome these limitations, the article summarizes the possibilities for conducting field experiments using online methods and platforms.
Experiments are generally designed to understand customer behavior. They can be conducted in physical form: customers are recruited and either assembled in one place or observed at their own location. The selected customers are divided into groups and exposed to the same or different scenarios. This approach has the advantage of letting the researcher observe customer behavior directly, at close quarters. But physical experiments are very costly to run, and it is difficult to identify and recruit a representative, suitably sized sample of participants. To overcome these drawbacks, the article suggests moving to the easier and more cost-effective method of online experiments.
Online field experiments
The article demonstrates the benefits of the online methodology: lower cost, easier access to a representative set of participants, the ability to recruit a large number of participants, and the ability to expose them to a larger number of scenarios.
The article draws on academic research on online field experiments and gives a brief overview of the technologies available for conducting them. These cover a broad spectrum of online sites, including social networking, user-generated content, e-commerce, crowdfunding, online games and crowdsourcing.
Technologies for intervention
Email and Text
Method: Emails varying in content, timing, length, etc. are sent to the enrolled participants. A hypothesis is formulated for each group based on the content it receives. The results are tracked and analyzed to either reject or fail to reject each hypothesis.
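The assignment step in this method can be sketched as below. This is a minimal illustration, not the procedure used in any of the cited studies: the participant IDs, the variant labels and the function name `assign_email_variants` are all assumptions made for the example.

```python
import random

def assign_email_variants(participants, variants, seed=0):
    """Randomly assign each participant to one email variant, in near-equal group sizes."""
    rng = random.Random(seed)  # fixed seed so the assignment can be reproduced
    shuffled = list(participants)
    rng.shuffle(shuffled)
    # Deal the shuffled participants round-robin across the variants.
    return {p: variants[i % len(variants)] for i, p in enumerate(shuffled)}

# Hypothetical participant IDs and variant labels, for illustration only.
participants = [f"user{i}" for i in range(12)]
variants = ["uniqueness", "goal_setting", "benefit", "control"]
assignment = assign_email_variants(participants, variants)
```

Shuffling before dealing keeps the groups balanced in size while leaving group membership to chance, which is what allows differences in response to be attributed to the email content.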
The article explains how many researchers use email to gain insight into customer behavior. For example, an experiment was run on the users of a movie rating website to test hypotheses derived from the collective effort model.
Results
- Highlighting uniqueness, by asking users to rate the least-rated movies, increases the probability that a user rates a movie.
- Setting specific rating goals (individual or group) results in a higher number of ratings.
- Highlighting the benefit of rating to the user, the group or others did not yield a positive response.
Other Observations
Rating activity peaked during the experiment and remained at a higher level even after the experiment closed.
Similarly, another email intervention was run on a microlending website to match potential lenders with entrepreneurs. This experiment, conducted with 22,233 participants, increased lender engagement, with a significant increase in lending activity among previously inactive teams.
Text messages can be used in the same way as email, but their use is limited by the amount of text visible on the screen. They are therefore mainly used where the penetration of mobile phones far exceeds the penetration of PCs.
Modified Web interface
Method: These interventions are mostly used to analyze the effect of changes to user interfaces. Participants are shown the same or different ads in different locations, orientations, sizes and shapes, and the data are captured and analyzed against the objective (say, highest number of clicks, highest conversion, or ease of use).
To illustrate the modified web interface, the article describes an experiment conducted at Yahoo! to investigate whether competing sponsored advertisements placed at the top of a webpage (north ads) exert externalities on each other.
Methodology
The experiment was run on two randomly allotted groups of 100,000 observations each. One group was exposed to north ads in which the first two ads were competitors; the other group saw non-competitor ads. The analysis suggested that placing two competitor ads together has a positive externality (effect) on the ad at the top of the search results.
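One common way to compare two groups of this kind is a two-proportion z-test on click-through rates. The sketch below assumes that technique purely for illustration; the source does not state which analysis was used, and the click counts are hypothetical, not results from the Yahoo! experiment.

```python
import math

def two_proportion_z_test(clicks_a, n_a, clicks_b, n_b):
    """Return (difference in rates, z statistic, two-sided p-value)."""
    p_a = clicks_a / n_a
    p_b = clicks_b / n_b
    # Pooled rate under the null hypothesis that both groups share one rate.
    pooled = (clicks_a + clicks_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_a - p_b, z, p_value

# Hypothetical click counts on the top ad for two groups of 100,000 observations.
diff, z, p = two_proportion_z_test(5200, 100_000, 5000, 100_000)
```

With groups this large, even a small difference in click-through rate can be distinguished from noise, which is part of what makes online experiments attractive.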
In another experiment, conducted in a social media environment (Facebook), the researchers were able to onboard on the order of 2.3 million and 5.7 million participants, a scale that would not be easily achievable or manageable in a physical experiment. This again demonstrates that online experiments can be more effective and less costly.
As a tool for intervention, a modified web interface can also be used in combination with emails.
Bots
A bot is a program or script that makes automated edits or suggestions. Because bots are designed to function autonomously, they can process a lot of data. This feature was used in an experiment on Wikipedia edit suggestions: the user is shown targeted links to articles they are most likely to edit, based on their previous edit history. Targeted suggestions of this kind produced nearly four times as many edits as random suggestions.
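The idea of matching suggestions to edit history can be illustrated with a deliberately simple sketch. It is not the algorithm used by the Wikipedia bot, which the source does not describe; the topic tags, the article titles and the function `suggest_articles` are hypothetical.

```python
def suggest_articles(edit_history_topics, candidates, k=3):
    """Rank candidate articles by how many topics they share with the user's past edits."""
    history = set(edit_history_topics)
    scored = [(len(history & set(topics)), title) for title, topics in candidates.items()]
    # Highest overlap first; ties are broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    # Keep at most k articles, and only those sharing at least one topic.
    return [title for score, title in scored[:k] if score > 0]

# Hypothetical candidate articles tagged with topics.
candidates = {
    "Jazz fusion": ["music", "jazz"],
    "Bebop": ["music", "jazz", "history"],
    "Volcano": ["geology"],
}
suggestions = suggest_articles(["jazz", "music"], candidates)
```

A random-suggestion baseline would simply sample from `candidates`; the experiment described above compares outcomes between these two kinds of suggestion.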
Because bots are automated and open to abuse, the owner or programmer should demonstrate that a bot is harmless, useful and not a server hog before it is accepted to operate on a website.
Deploying bots on every website can be challenging: the required level of observational detail, or a programming interface for extending the underlying site or browser to implement monitoring, may not be available or feasible. Nonetheless, bots are a way to enhance matching at relatively low cost.
Add-ons
An add-on is an extension installed in a browser. Add-ons can monitor a participant's behavior across multiple sites. Using an add-on, researchers can access participants' cross-site behavior over an extended period, creating a large-scale longitudinal panel that facilitates data collection and intervention.
Add-ons are important when an experiment needs activity data from multiple sites. Their benefits were demonstrated in a study conducted in the US in which users were sent balanced political news articles drawn from different sites.
An add-on also allows the intervention to be delivered and observed in real time, unlike email, where the message is sent and the researcher waits for the participant to respond.
Design Choices
1. Access and degree of control
To avoid ethical conflicts, the degree of experimenter control is very important in conducting any experiment.
a. Experimenter-as-user involves minimal or no collaboration with the site owners.
b. A site with a public interface is another option that allows for substantial experimenter control.
c. A collaborative relationship with a site owner is another choice that can provide a fair amount of data and control.
d. Owning your own site is the option that gives the experimenter the most control and flexibility in the experimental design and data collection.
2. Recruiting, informed consent and the IRB
A minimum requirement of good recruitment is that ethical standards are met in recruiting and treating participants. In general, researchers who plan to run online field experiments should go through the IRB (institutional review board) process so that a disinterested third party evaluates the ethical aspects of the proposed experiment, even though the process might not be able to screen out all unethical studies.
Online experiments use two types of subject recruitment:
- Natural selection
- Sampling
Natural selection: As the name suggests, the aim is to study participants in their natural environment. No consent is taken from them, because awareness of the experiment can affect their behaviour.
Sampling: The experimenter purchases or accesses a database of potential participants and selects participants according to the requirements of the study. The researcher may divide them into different samples as the design requires, and may even recruit users on one site and redirect them to the researcher's own site to continue the experiment.
3. Identification and authentication
Many online studies continue over time. It is therefore important for a researcher to establish each individual's identity throughout the study and to maintain the authenticity of the user and the user's data.
Identification requires a user to have a fixed login name; authentication ensures that the user is an actual person and not an automated bot. But always requiring users to identify and authenticate themselves, for example asking them to log in to browse news websites such as CNN or The New York Times, can be counterproductive, so the researcher should strike a trade-off between authentication and ease of access.
Several methods can track users without logins. The major ones are:
a. Session tracking: Session tracking on a web server can identify a sequence of user actions within a session, but not across sessions.
b. IP address: IP addresses can be used to track a user across multiple sessions originating from the same computer. However, they cannot follow a user from computer to computer.
c. Cookies: Cookies are small files that a website can ask a user's web browser to store on the user's computer and deliver later. Cookies can identify a user even if her IP address changes, but not if the user moves to a different computer or browser, or chooses to reject cookies.
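Cookie-based tracking matters for experiments because a returning user should keep seeing the same experimental condition. One way to achieve this, sketched here as an assumption rather than as a method from the article, is to derive the condition from a hash of the cookie identifier. The names `assign_condition` and `salt` and the condition labels are hypothetical.

```python
import hashlib

def assign_condition(cookie_id, conditions=("control", "treatment"), salt="experiment-1"):
    """Map a cookie identifier to a condition deterministically."""
    # The salt keeps assignments independent between different experiments.
    digest = hashlib.sha256(f"{salt}:{cookie_id}".encode("utf-8")).hexdigest()
    return conditions[int(digest, 16) % len(conditions)]

# The same cookie always yields the same condition, across sessions and IP addresses.
first_visit = assign_condition("cookie-abc123")
later_visit = assign_condition("cookie-abc123")
```

Consistent with the limitation noted for cookies, a user who switches browser or rejects cookies would receive a new identifier and might land in a different condition.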
4. Control Group
The control group is an important ingredient in any experiment. In many online experiments, at least two different control groups are maintained to obtain better results. A control group receives either no treatment or a different treatment that does not contain the stimulus being studied. The latter is comparable to a placebo in medical experiments and makes it possible to isolate the effect of the stimulus more accurately. To be effective, the control group needs to be selected from the same pool of recruits, volunteers or other eligible subjects.
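The role of the two control groups can be sketched with a simple difference in mean outcomes. This is an illustrative calculation under assumed data, not an analysis from the article: the outcome values and the function `effect_vs_controls` are hypothetical.

```python
from statistics import mean

def effect_vs_controls(treatment, placebo_control, no_treatment_control):
    """Return the difference in mean outcome between treatment and each control group."""
    return {
        # Isolates the stimulus itself, since both groups received some contact.
        "vs_placebo": mean(treatment) - mean(placebo_control),
        # Includes the stimulus plus any effect of merely being contacted.
        "vs_no_treatment": mean(treatment) - mean(no_treatment_control),
    }

# Hypothetical outcomes (for example, contributions per participant).
effects = effect_vs_controls([5, 7, 6], [4, 5, 3], [3, 4, 2])
```

Comparing the two differences gives a rough sense of how much of the observed change comes from the stimulus and how much from contact alone.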
Out of scope of the study
Internal projects and experiments done by major IT companies that are not available in the public domain.
Natural Experiments in the health sector
A natural experiment is an empirical study in which the individuals (or clusters of individuals) exposed to the experimental and control conditions are determined by nature or by other factors outside the control of the investigators, yet the process governing the exposures arguably resembles random assignment. Natural experiments are used whenever controlled experiments are difficult to implement or unethical. One example is evaluating the health and economic impacts of ionizing radiation on people living near Hiroshima at the time of the atomic blast. A few prominent reasons to perform a natural experiment are:
- to widen the range of interventions that can usefully be evaluated, and
- to encourage a rigorous and imaginative approach to the use of observational data in evaluating interventions, which should allow stronger conclusions about cause.
In mathematical terms:
The independent variable (X) is not a planned intervention designed to influence the outcome (Y):
Y ~ F(X)
A natural experiment needs to satisfy a few conditions in advance for its design to be viable and effective. These include a reasonable expectation that the intervention will have a significant effect, the availability of relevant data on the appropriate population, and the potential of the intervention for replication, generalizability and scalability. Natural experiments have a reputation for being specified very vaguely, so following a standard format is beneficial for later researchers.
Natural experiments have also been carried out on the internet to analyze several scenarios.
In a Wikipedia experiment, editing awards ("barnstars") were assigned to a subset of the 1% most productive Wikipedia contributors. Comparison with the control group shows that receiving a barnstar increases productivity by 60% and makes contributors six times more likely to receive additional barnstars from other community members, revealing that informal rewards significantly affect individual effort.
A challenge for many online communities is increasing members' contributions over time. Prior studies on peer feedback in online communities have suggested that it affects contribution, but they have been limited by their correlational nature. A field experiment was therefore used to test the effects of positive, negative, directive and social feedback on members' contributions and to help design effective feedback systems.
Natural experiments in health sector - understandings and conclusions
- Good working relationships between researchers and organizational policy makers and flexible forms of research funding are necessary to exploit the opportunities generated by policy changes.
- Research effort should focus on important but answerable questions, accepting that some interesting questions may be genuinely intractable, and taking a pragmatic approach based on combinations of methods, careful testing of assumptions and transparent reporting.
- Given the difficulty of eliminating bias in non-randomized studies, quantitative estimates of bias should become a standard feature of reporting.
- A prospective register of natural experimental studies, as has already been suggested in the case of smoke-free legislation and for public health interventions generally, would be a major step forward.
- The case studies we have presented illustrate the crucial role of routinely collected data, either via administrative systems or long-running population surveys.
- Building up experience of these promising but lesser used methods, to determine whether and in what circumstances their theoretical advantages and disadvantages matter in practice, is another key to future progress.