Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 Unported License.

Thursday, June 6, 2013

Simulating ICSE marks (Hacking into the Indian Education System)


Ref: http://deedy.quora.com/Hacking-into-the-Indian-Education-System?srid=3THA&share=1

I read the article above and could not resist trying to simulate the behavior. The spike problem pointed out in the article is similar to one that occurs with hash functions. If the graders grade in certain increments, e.g. 0, 2, 4, 8, 10, or 0, 5, 10, it is possible to replicate the plots.

The plots below are generated assuming there are internal grades for some project and a number of questions with some marks per question. I also assume that the grader grades in some increments, e.g. 0, 2, 4, 8, 10, or 0, 5, 10. I generated random samples for 100,000 students for both internal and external marks. The source code is at the end. The first set of plots is without any bias, i.e. the grader was not biased toward giving good or bad grades. The second set of plots assumes three different types of graders, each of which is either unbiased (0), biased to give good marks (positive bias), or biased to give bad marks (negative bias).
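To see why increments alone can leave gaps in the distribution, one can enumerate every total that is reachable at all. The sketch below is not part of the original simulation: `reachable_totals` is a hypothetical helper that reuses the same assumptions as the source code at the end of this post (one internal mark plus a fixed number of questions, each graded from a short list of allowed marks).

```python
def reachable_totals(internal_marks, per_question_marks, total):
    """Return the set of totals a student can possibly obtain.

    Hypothetical helper: the number of questions is derived the same way
    as in the simulation, so that the maximum marks add up to `total`.
    """
    nquestions = (total - max(internal_marks)) // max(per_question_marks)
    totals = set(internal_marks)
    for _ in range(nquestions):
        # Add one more question, graded with any of the allowed marks.
        totals = set(t + q for t in totals for q in per_question_marks)
    return totals


even_steps = reachable_totals([0, 10, 20], [0, 2, 4, 8, 10], 100)
five_steps = reachable_totals([0, 10, 20], [0, 5, 10], 100)

# Totals between 0 and 100 that no student can ever receive.
missing_even = [t for t in range(101) if t not in even_steps]
missing_five = [t for t in range(101) if t not in five_steps]
print(len(missing_even), len(missing_five))
```

With increments of 2 every odd total is unreachable, and with increments of 5 only multiples of 5 can occur, so a histogram with one bin per mark shows empty bins between the populated ones.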

Without Bias: We can see evidence of spikes in almost all cases below, except the last one.







With Bias: With bias, we can still see spikes, and we also see skewed and bimodal distributions.






Conclusion: It is possible that ICSE graders are grading questions in some common increments and are not necessarily rigging the grades.


Python Source Code:

With Bias:
from __future__ import division, print_function

import math
import numpy
import pylab


def marks2(n=100, internal_marks=[0, 10, 20],
           per_question_marks=[0, 2, 4, 8, 10], total=100, figname=None,
           bias1=0, bias2=0, bias3=0):
    # Simulate three groups of n students, one group per grader bias.
    m1 = marks(n, internal_marks, per_question_marks, total, figname, bias1)
    m2 = marks(n, internal_marks, per_question_marks, total, figname, bias2)
    m3 = marks(n, internal_marks, per_question_marks, total, figname, bias3)
    m = numpy.concatenate([m1, m2, m3])

    # One histogram bin per integer mark from 0 to total.
    x = numpy.arange(total+2)
    y, _ = numpy.histogram(m, x)

    print("Mean=%s, stderr=%s min=%s max=%s median=%s" %
          (m.mean(), m.std() / math.sqrt(len(m)), m.min(), m.max(), numpy.median(m)))

    pylab.plot(x[:-1], y, 'r-o')
    pylab.grid()
    pylab.title("Internal=%s, External=%s,\nn=%s, bias1=%d, bias2=%d bias3=%d" % (str(internal_marks), str(per_question_marks), n, bias1, bias2, bias3))
    pylab.xlabel("marks")
    pylab.draw()


def marks(n=100, internal_marks=[0, 10, 20],
          per_question_marks=[0, 2, 4, 8, 10], total=100, figname=None, bias=0):
    # Simulate the total marks of n students graded with the given bias.
    m = numpy.zeros(n)
    for i in range(n):
        m[i] = marks_one_student(internal_marks, per_question_marks, total, bias)
    return m


def pos(n, bias):
    # Random index in [0, n-1], shifted by bias and clipped to the valid range.
    return numpy.max([0, numpy.min([numpy.random.randint(n) + bias, n-1])])


def marks_one_student(internal_marks, per_question_marks, total, bias):
    # Number of questions needed so that the maximum marks add up to total.
    nquestions = (total - numpy.max(internal_marks)) // numpy.max(per_question_marks)
    assert numpy.max(internal_marks) + \
        nquestions * numpy.max(per_question_marks) == total

    marks = internal_marks[pos(len(internal_marks), bias)]

    for i in range(nquestions):
        marks += per_question_marks[pos(len(per_question_marks), bias)]

    return marks
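
As a side note on how the bias acts: `pos` draws a uniform index, shifts it by `bias`, and clips the result to the valid range, so the probability mass that would fall outside the list piles up on the first or last entry. The sketch below is an illustration only; `biased_index_distribution` is a hypothetical helper that computes this distribution exactly by enumerating the possible draws instead of sampling.

```python
from collections import Counter
from fractions import Fraction


def biased_index_distribution(n, bias):
    """Exact distribution of max(0, min(draw + bias, n - 1)),
    where draw is uniform on 0..n-1 (hypothetical helper)."""
    counts = Counter(max(0, min(draw + bias, n - 1)) for draw in range(n))
    return dict((idx, Fraction(c, n)) for idx, c in counts.items())


# Five allowed marks per question, as in per_question_marks=[0, 2, 4, 8, 10].
for bias in (-1, 0, 1):
    print(bias, sorted(biased_index_distribution(5, bias).items()))
```

With five allowed marks and a bias of +1, the lowest mark can never be chosen and the highest mark is chosen with probability 2/5; a bias of -1 mirrors this at the low end.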

Without Bias:

from __future__ import division, print_function

import math
import numpy
import pylab


def marks(n=100, internal_marks=[0, 10, 20],
          per_question_marks=[0, 2, 4, 8, 10], total=100, figname=None):
    # Simulate the total marks of n students.
    m = numpy.zeros(n)
    for i in range(n):
        m[i] = marks_one_student(internal_marks, per_question_marks, total)

    print("Mean=%s, stderr=%s min=%s max=%s median=%s" %
          (m.mean(), m.std() / math.sqrt(len(m)), m.min(), m.max(), numpy.median(m)))

    # One histogram bin per integer mark from 0 to total.
    x = numpy.arange(total+2)
    y, _ = numpy.histogram(m, x)

    print(len(x), x)
    print(len(y), y)

    pylab.plot(x[:-1], y, 'r-o')
    pylab.grid()
    pylab.title("Internal=%s, External=%s,\nn=%s, Mean=%.1f, Median=%.1f" % (str(internal_marks), str(per_question_marks), n, m.mean(), numpy.median(m)))
    pylab.xlabel("marks")
    pylab.draw()

    if figname:
        pylab.savefig(figname)

    return m


def marks_one_student(internal_marks, per_question_marks, total):
    # Number of questions needed so that the maximum marks add up to total.
    nquestions = (total - numpy.max(internal_marks)) // numpy.max(per_question_marks)
    assert numpy.max(internal_marks) + \
        nquestions * numpy.max(per_question_marks) == total

    marks = internal_marks[numpy.random.randint(0, len(internal_marks))]

    for i in range(nquestions):
        marks += per_question_marks[numpy.random.randint(0, len(per_question_marks))]

    return marks


Wednesday, May 8, 2013

Min and Max of 2 numbers


Problem: Find the minimum and maximum of two given numbers.

Solution in Python:

def max(n, m):
    """
    Author: Mayur P Srivastava
    """

    if n >= m:
        return n
    return m

def min(n, m):
    """
    Author: Mayur P Srivastava
    """

    if n < m:
        return n
    return m


Concepts Learned: Logical conditions.

Cricket net run rate


Problem: Calculate net run rate for a cricket match, given runs scored by team A (batted first), overs played by team A, runs scored by team B, overs played by team B, winning team name, whether team A was bowled out, whether team B was bowled out, total number of overs in one innings.

Solution in Python:


from __future__ import division

import math

def net_run_rate(runsA, oversA, runsB, oversB,
                 winning_team,
                 bowled_outA=False,
                 bowled_outB=False,
                 total_overs=50,
                 eps=1e-8):
    """
    Author: Mayur P Srivastava
    Reference: http://en.wikipedia.org/wiki/Net_run_rate
    """

    assert winning_team in ['AB', 'A', 'B']

    if winning_team == 'AB':
        return 0.0, 0.0

    oversA = parse_overs(oversA)
    oversB = parse_overs(oversB)

    if abs(oversA - total_overs) > eps:
        bowled_outA = True

    if abs(oversB - total_overs) > eps and winning_team == 'A':
        bowled_outB = True

    if bowled_outA:
        oversA = total_overs
    if bowled_outB:
        oversB = total_overs

    rrA = runsA / oversA
    rrB = runsB / oversB
    
    if winning_team == 'A':
        nrrA = rrA - rrB
        nrrB = -nrrA
    else:
        nrrB = rrB - rrA
        nrrA = -nrrB

    return (nrrA, nrrB), (runsA, runsB), (oversA, oversB)


def parse_overs(o):
    completed_overs = math.floor(o)

    balls = math.floor(0.5 + 10 * (o - completed_overs))
    assert balls >= 0 and balls < 6
    
    return completed_overs + balls / 6.0



Concepts Learned: Maths
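
A small worked example may help with the two less obvious steps: the overs notation, where the digit after the point counts balls rather than tenths, and the rule that a side that is bowled out is charged its full quota of overs. The match figures below are made up for illustration, and `overs_to_decimal` is a hypothetical stand-in that mirrors `parse_overs`.

```python
import math


def overs_to_decimal(o):
    # Hypothetical helper mirroring parse_overs: 40.2 means 40 overs and 2 balls.
    completed = math.floor(o)
    balls = int(math.floor(0.5 + 10 * (o - completed)))
    assert 0 <= balls < 6
    return completed + balls / 6.0


# Made-up match: A scores 250 in 50 overs, B is all out for 200 in 40.2 overs.
overs_b_faced = overs_to_decimal(40.2)   # 40 + 2/6 overs actually faced
rr_a = 250 / 50.0
rr_b_naive = 200 / overs_b_faced         # ignores the bowled-out rule
rr_b = 200 / 50.0                        # bowled out: the full 50 overs are charged
nrr_a = rr_a - rr_b
print(overs_b_faced, rr_b_naive, nrr_a)
```

Charging the full 50 overs lowers B's run rate from about 4.96 to 4.0, which gives A a net run rate of +1.0 and B a net run rate of -1.0 for this match.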


Cricket run rate


Problem: Compute current run rate and required run rate for a cricket game. Given: current score, target score, number of overs bowled, total number of overs.

Solution in Python:


from __future__ import division

import math

def calculate_run_rates(current_score, target_score, current_overs, total_overs):
    """

    Author: Mayur P Srivastava

    In overs, fraction part represents number of balls,
    e.g. 5.1, 5.2, 5.3, 5.4, 5.5, 6.0
    """

    current_overs = parse_overs(current_overs)
    total_overs   = parse_overs(total_overs)

    if current_overs > 0:
        current_rr = current_score / current_overs
    else:
        current_rr = 0

    remaining_overs = total_overs - current_overs
    runs_to_win     = target_score - current_score + 1

    required_rr = runs_to_win / remaining_overs

    return current_rr, required_rr


def parse_overs(o):
    completed_overs = math.floor(o)

    balls = math.floor(0.5 + 10 * (o - completed_overs))
    assert balls >= 0 and balls < 6
    
    return completed_overs + balls / 6.0

Concepts Learned: Maths
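
An alternative sketch, under the assumption that the caller can supply the number of balls bowled as an integer: counting balls avoids decoding the 5.1, 5.2, ... notation from a float. `run_rates` is a hypothetical name, and it keeps the same convention as above that the chasing side needs `target_score + 1` runs to win.

```python
def run_rates(current_score, target_score, balls_bowled, total_overs):
    """Return (current run rate, required run rate), both in runs per over.

    Hypothetical variant that takes balls bowled as an integer count.
    """
    balls_total = 6 * total_overs
    balls_left = balls_total - balls_bowled

    # Runs per ball times 6 gives runs per over.
    current_rr = 6.0 * current_score / balls_bowled if balls_bowled else 0.0

    runs_to_win = target_score - current_score + 1
    required_rr = 6.0 * runs_to_win / balls_left if balls_left else float('inf')

    return current_rr, required_rr


# 120 runs after 20.3 overs (123 balls), chasing 250 in a 50-over innings.
current_rr, required_rr = run_rates(120, 250, 20 * 6 + 3, 50)
print(round(current_rr, 2), round(required_rr, 2))
```

Here 131 runs are needed from the remaining 177 balls, so the required rate is lower than the current rate.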

Matrix Addition


Problem: Add the two given matrices.

Solution in Python:


def add(A, B):
    """
    Author: Mayur P Srivastava
    """

    m1, n1 = shape(A)
    m2, n2 = shape(B)

    if not can_add(m1, n1, m2, n2):
        return None

    C = create_matrix(m1, n2)

    for i in range(m1):
        for j in range(n2):
            C[i][j] = A[i][j] + B[i][j]

    return C

def shape(A):
    m = len(A)   
    n = 0
    for row in A:
        n2 = len(row)
        if n == 0:
            n = n2
        elif n != n2:
            assert False

    return m, n

def can_add(m1, n1, m2, n2):
    return m1 == m2 and n1 == n2

def create_matrix(m, n, value=0):
    matrix = []
    for i in range(m):
        row = []
        for j in range(n):
            row.append(value)
        matrix.append(row)
    return matrix



Concepts Learned: Nested loops and Maths

Matrix Multiplication


Problem: Multiply the two given matrices.

Solution in Python:

def multiply(A, B):
    """
    Author: Mayur P Srivastava
    """


    m1, n1 = shape(A)
    m2, n2 = shape(B)

    if not can_multiply(m1, n1, m2, n2):
        return None

    C = create_matrix(m1, n2)

    for i in range(m1):
        for j in range(n2):
            c = 0
            for k in range(n1):
                c += A[i][k] * B[k][j]
            C[i][j] = c

    return C

def shape(A):
    m = len(A)   
    n = 0
    for row in A:
        n2 = len(row)
        if n == 0:
            n = n2
        elif n != n2:
            assert False

    return m, n

def can_multiply(m1, n1, m2, n2):
    return n1 == m2

def create_matrix(m, n, value=0):
    matrix = []
    for i in range(m):
        row = []
        for j in range(n):
            row.append(value)
        matrix.append(row)
    return matrix

Concepts Learned: Maths
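
For comparison, the same row-by-column rule can be written more compactly by pairing each row of A with each column of B. This is an alternative sketch and not the solution above; `multiply_compact` is a hypothetical name, and it assumes both matrices are non-empty lists of equal-length rows.

```python
def multiply_compact(A, B):
    # Hypothetical compact variant: returns None when the shapes do not match.
    if len(A[0]) != len(B):
        return None
    columns_of_B = list(zip(*B))  # transpose B so its columns can be iterated
    return [[sum(a * b for a, b in zip(row, col)) for col in columns_of_B]
            for row in A]


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(multiply_compact(A, B))  # [[19, 22], [43, 50]]
```

The top-left entry is 1*5 + 2*7 = 19, which is the inner loop over k in the solution above written as a sum over zipped pairs.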

Divisibility


Problem: Check whether a given number n is divisible by another number m.

Solution in Python:

def is_divisible(n, m):
    """
    Author: Mayur P Srivastava
    """


    if m == 0:
        return False

    return n % m == 0

Concepts Learned: Maths