pROGRAMMERDATASCIENCE22
diff --git a/‎README.md
Lines changed: 7 additions & 7 deletions b/‎README.md
diff --git a/‎others/hit_counter.md
Lines changed: 105 additions & 0 deletions b/‎others/hit_counter.md
diff --git a/‎others/job_scheduler.md
Lines changed: 70 additions & 0 deletions b/‎others/job_scheduler.md
diff --git a/‎others/kaprekar's_constant.md
Lines changed: 81 additions & 0 deletions b/‎others/kaprekar's_constant.md
diff --git a/‎others/regex.md
Lines changed: 74 additions & 0 deletions b/‎others/regex.md
@@ -161,10 +161,10 @@ A catalogue of data structures implementation + algorithms and coding problems a
 
 ## Others
 
-- [Hit Counter](others/hit_counter.ipynb)
-- [Job Scheduler](others/job_scheduler.ipynb)
-- [Kaprekar's constant](others/kaprekar's_constant.ipynb)
-- [Regular expression matching](others/regex.ipynb)
-- [Stable marriage problem](others/stable_marriage_problem.ipynb)
-- [Url on the browser](others/url_browser_explanation.ipynb)
-- [Word sense disambiguation](others/word_sense_disambiguation.ipynb)
+- [Hit Counter](others/hit_counter.md)
+- [Job Scheduler](others/job_scheduler.md)
+- [Kaprekar's constant](others/kaprekar's_constant.md)
+- [Regular expression matching](others/regex.md)
+- [Stable marriage problem](others/stable_marriage_problem.md)
+- [Url on the browser](others/url_browser_explanation.md)
+- [Word sense disambiguation](others/word_sense_disambiguation.md)
@@ -0,0 +1,105 @@
+## Hit Counter
+
+Design and implement a HitCounter class that keeps track of requests (or hits). It should support the following operations:
+
+- record(timestamp): records a hit that happened at timestamp
+- total(): returns the total number of hits recorded
+- range(lower, upper): returns the number of hits that occurred between timestamps lower and upper (inclusive)
+
+What if our system has limited memory?
+
+
+```python
+class HitCounter:
+    def __init__(self):
+        self.hits = []
+    
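+    def record(self, timestamp):
+        # O(1) append; timestamps are stored in arrival order
+        self.hits.append(timestamp)
+    
+    def total(self):
+        return len(self.hits)
+    
+    def range(self, lower, upper):
+        # linear scan: count hits with lower <= timestamp <= upper
+        count = 0
+        for hit in self.hits:
+            if lower <= hit <= upper:
+                count += 1
+        return count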
+```
+
+Here record and total are constant-time operations, while range takes O(N) time.
+
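+One tradeoff we can make is to keep the hits in a sorted list or a BST. This lets range run in O(log N) time using binary search. We can use Python's [bisect](https://docs.python.org/3/library/bisect.html) module to keep the list sorted. Note that inserting into a Python list shifts elements, so record becomes O(N) in the worst case, whereas a balanced BST would keep it at O(log N).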
+
+
+
+```python
+import bisect
+
+
+class HitCounter:
+    def __init__(self):
+        self.hits = []
+        
+    def record(self, timestamp):
+        # insert while keeping self.hits sorted
+        bisect.insort_left(self.hits, timestamp)
+    
+    def total(self):
+        return len(self.hits)
+    
+    def range(self, lower, upper):
+        # index of the first hit >= lower and of the first hit > upper
+        low = bisect.bisect_left(self.hits, lower)
+        high = bisect.bisect_right(self.hits, upper)
+        return high - low
+```
+
+While this is time efficient, it'll still take a lot of space because we are still saving each timestamp into the list.
+
+We can sacrifice accuracy for memory by grouping timestamps into minutes or hours. We'll lose accuracy around the bucket borders, but use up to a constant factor less space.
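As a rough illustration of that accuracy trade-off, the sketch below buckets hits by minute with a `collections.Counter`. The helper name `bucketed_range` and the 60-second bucket width are assumptions chosen for this example, not part of the solution in this file. A query that starts or ends mid-minute counts the whole bucket, so it can overcount.

```python
from collections import Counter

BUCKET_SECONDS = 60  # assumed bucket width: one minute


def bucketed_range(buckets, lower, upper):
    """Count hits in every bucket that overlaps [lower, upper]."""
    lo = lower // BUCKET_SECONDS
    hi = upper // BUCKET_SECONDS
    return sum(count for minute, count in buckets.items() if lo <= minute <= hi)


timestamps = [5, 59, 61, 119, 125]
# Three buckets (minutes 0, 1 and 2) replace five stored timestamps.
buckets = Counter(t // BUCKET_SECONDS for t in timestamps)

exact = sum(1 for t in timestamps if 60 <= t <= 100)
approx = bucketed_range(buckets, 60, 100)
print(exact, approx)  # the bucketed count also includes the hit at 119
```

The exact count only sees the hit at 61, while the bucketed count reports every hit recorded in minute 1.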
+
+For our solution, we'll keep track of each group in a tuple, where the first item is a timestamp (in minutes) and the second item is the number of hits occurring within that minute. We'll keep the list of tuples sorted by minute so that record can find a minute's bucket with a binary search.
+
+```
+tuple = (minute,  hits_within_this_minute)
+```
+
+
+```python
+import bisect
+from math import floor
+
+class HitCounter:
+    def __init__(self):
+        self.hits = []
+        self.counter = 0
+        
+    def record(self, timestamp):
+        self.counter += 1
+        
+        minute = floor(timestamp / 60)
+        
+        # (minute,) sorts before any (minute, count), so this finds the bucket's slot
+        idx = bisect.bisect_left(self.hits, (minute,))
+        
+        if idx < len(self.hits) and self.hits[idx][0] == minute:
+            self.hits[idx] = (minute, self.hits[idx][1] + 1)
+        else:
+            self.hits.insert(idx, (minute, 1))
+        
+    def total(self):
+        return self.counter
+    
+    def range(self, lower, higher):
+        lo = floor(lower / 60)
+        hi = floor(higher / 60)
+        lo_idx = bisect.bisect_left(self.hits, (lo,))
+        # index of the first bucket strictly after minute hi
+        hi_idx = bisect.bisect_left(self.hits, (hi + 1,))
+        
+        # sum the counts of each tuple within the minutes lo..hi
+        return sum(self.hits[i][1] for i in range(lo_idx, hi_idx))
+```
+
+
@@ -0,0 +1,70 @@
+## Job Scheduler 
+
+Implement a job scheduler that takes in a function f and an integer N, and calls the function after N milliseconds.
+
+
+### 1st Approach
+There are many ways to do this. The most straightforward solution is to spin off a new thread for each function we want to delay, sleep for N milliseconds, then run the function.
+
+
+
+```python
+import threading
+from time import sleep
+
+class Scheduler:
+    def __init__(self):
+        pass
+    
+    def delay(self, func, n):
+        def sleep_then_call():
+            sleep(n / 1000)  # n is in milliseconds; sleep() takes seconds
+            func()
+        
+        t = threading.Thread(target=sleep_then_call)
+        t.start()
+```
+
+###  2nd Approach
+While this works, there's a big problem with this design: we spin off a new thread each time we call delay! The number of threads grows with the number of functions waiting to run.
+
+We can solve this by having one dedicated thread to call functions, and storing functions we need to call in some data structure, say a list. 
+
+We then poll to check when to run a function. We can store each function along with a Unix epoch timestamp (in milliseconds) that tells us when it should run.
+
+After checking the list for any jobs that are due to run, we run them and remove them from the list.
+
+
+```python
+import threading
+from time import sleep, time
+
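+class Scheduler:
+    def __init__(self):
+        self.functions = []  # saves tuples of (function, time-to-run-it in ms)
+        self.lock = threading.Lock()  # guards self.functions across threads
+        t = threading.Thread(target=self.poll)
+        t.start()
+        
+    def poll(self):
+        while True:
+            now = time() * 1000.  # change from sec to ms
+            with self.lock:
+                # split the list into jobs that are due and jobs that must wait
+                due_now = [function for function, due in self.functions if due <= now]
+                self.functions = [(function, due) for (function, due) in self.functions if due > now]
+            for function in due_now:
+                function()  # run outside the lock so delay() is not blocked
+            sleep(0.01)
+            
+    def delay(self, function, n):
+        with self.lock:
+            self.functions.append((function, time() * 1000 + n))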
+```
+
+You can go further by doing:
+- Extend the scheduler to allow functions that take arguments
+- Use a heap instead of a list to find the next job to run more efficiently
+- Wake the scheduler thread when a job becomes due, say with a condition variable, instead of polling
+- Use a thread pool to run jobs, so that one long-running job cannot starve the others (starvation is when a job is unable to run because another one is still running)
+
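The first three extensions can be sketched together as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the class name `HeapScheduler`, the extra `*args` parameter on `delay`, and the use of `time.monotonic()` instead of wall-clock time are all choices made for this example.

```python
import heapq
import itertools
import threading
import time


class HeapScheduler:
    """Hypothetical variant: one worker thread, a min-heap of jobs, no polling."""

    def __init__(self):
        self._jobs = []                # heap of (due_time, seq, func, args)
        self._seq = itertools.count()  # tie-breaker so functions are never compared
        self._cond = threading.Condition()
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def delay(self, func, n, *args):
        """Schedule func(*args) to run after n milliseconds."""
        due = time.monotonic() + n / 1000
        with self._cond:
            heapq.heappush(self._jobs, (due, next(self._seq), func, args))
            self._cond.notify()  # the earliest job may have changed

    def _run(self):
        while True:
            with self._cond:
                while True:
                    if not self._jobs:
                        self._cond.wait()  # sleep until a job is added
                        continue
                    wait_for = self._jobs[0][0] - time.monotonic()
                    if wait_for <= 0:
                        break
                    # sleep until the earliest job is due or a new job arrives
                    self._cond.wait(timeout=wait_for)
                _, _, func, args = heapq.heappop(self._jobs)
            func(*args)  # run outside the lock so delay() is never blocked


results = []
scheduler = HeapScheduler()
scheduler.delay(results.append, 60, "second")
scheduler.delay(results.append, 20, "first")
time.sleep(0.3)
print(results)
```

Jobs still run one at a time on the single worker, so a slow job delays the ones behind it; handing `func` to a thread pool would be the next step.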
+
@@ -0,0 +1,81 @@
+## Kaprekar's constant
+
+The number 6174 is known as Kaprekar's constant, after the mathematician who discovered an associated property: for all four-digit numbers with at least two distinct digits, repeatedly applying a simple procedure eventually results in this value.
+
+The procedure is as follows:
+
+- For a given input x, create two new numbers that consist of the digits in x in ascending and descending order.
+- Subtract the smaller number from the larger number.
+
+For example, this algorithm terminates in three steps when starting from 1234:
+
+```js
+4321 - 1234 = 3087
+8730 - 0378 = 8352
+8532 - 2358 = 6174
+```
+Write a function that returns how many steps this will take for a given input N.
+
+## Solution
+To solve this imperatively, we can implement a while loop that continually runs the procedure described above until obtaining the number 6174. 
+
+For each iteration of the loop we will increment a counter for the number of steps, and return this value at the end.
+
+We also use a helper function that prepends zeros if necessary so that the number always remains four digits long, before creating the ascending and descending integers.
+
+
+```python
+def get_digits(n):
+    # left-pad with zeros so the number is always four digits long
+    return str(n).zfill(4)
+
+def count_steps(n):
+    # assumes n has at least two distinct digits; otherwise the loop never reaches 6174
+    count = 0
+    while n != 6174:
+        digits = sorted(get_digits(n))
+        n = int(''.join(reversed(digits))) - int(''.join(digits))
+        count += 1
+    return count
+```
+
+
+```python
+count_steps(12)
+```
+
+
+
+
+    3
+
+
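To check the claim for every valid input, a brute-force sketch like the one below can run the procedure on all numbers from 1 to 9999 that have at least two distinct digits once zero-padded. The names `kaprekar_step` and `steps_to_constant` are hypothetical helpers for this example; the routine is known to reach 6174 in at most seven steps.

```python
def kaprekar_step(n):
    """Apply one round of the procedure to n, treated as a zero-padded 4-digit number."""
    digits = sorted(str(n).zfill(4))
    ascending = int("".join(digits))
    descending = int("".join(reversed(digits)))
    return descending - ascending


def steps_to_constant(n):
    """Number of rounds until n reaches 6174; n needs at least two distinct digits."""
    if len(set(str(n).zfill(4))) < 2:
        raise ValueError("repdigits such as 1111 collapse to 0 and never reach 6174")
    steps = 0
    while n != 6174:
        n = kaprekar_step(n)
        steps += 1
    return steps


# Every number from 1 to 9999 except the nine repdigits 1111, 2222, ..., 9999.
valid = [n for n in range(1, 10000) if len(set(str(n).zfill(4))) >= 2]
worst = max(steps_to_constant(n) for n in valid)
print(len(valid), worst)
```

The explicit `ValueError` guards against the one case where the loop would otherwise never terminate.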
+
+
+```python
+### Recursive solution
+def count_steps(n, steps=0):
+    if n == 6174:
+        return steps
+    num = int(''.join(sorted(get_digits(n), reverse=True))) - int(''.join(sorted(get_digits(n))))
+    
+    return count_steps(num, steps + 1)
+```
+
+
+```python
+count_steps(1234)
+```
+
+
+
+
+    3
+
+
+
+
@@ -0,0 +1,74 @@
+## Implement Regular Expression Matching
+
+Implement regular expression matching with the following special characters:
+
+- `.` (period), which matches any single character
+- `*` (asterisk), which matches zero or more of the preceding element
+
+That is, implement a function that takes in a string and a valid regular expression and returns whether or not the string matches the regular expression.
+
+For example, given the regular expression "ra." and the string "ray", your function should return true. The same regular expression on the string "raymond" should return false.
+
+Given the regular expression ".*at" and the string "chat", your function should return true. The same regular expression on the string "chats" should return false.
+
+
+```python
+### Approach
+
+# A helper function checks whether the first characters match.
+
+# Base case: if r == '', return s == ''.
+# Otherwise, if the first character in r is not followed by an asterisk (*), match the first character of both r and s. If they match, return matches(s[1:], r[1:]). If they don't, return False.
+# If the first character in r is followed by an asterisk, first try matching zero occurrences (skip both characters of r), then keep consuming matching characters from s and try the rest of r after each one.
+
+def matches_first_char(s, r):
+    # check len(s) first so an empty string never raises IndexError
+    return len(s) > 0 and (s[0] == r[0] or r[0] == ".")
+
+def matches(s, r):
+    # base case 
+    if r == "":
+        return s == ""
+    
+    # The first char in the regex r is not followed by a *
+    if len(r) == 1 or r[1] != "*":
+        if matches_first_char(s, r):
+            return matches(s[1:], r[1:])
+        else:
+            return False
+        
+    # The first char in r is followed by a *
+    if matches(s, r[2:]):
+        # Try zero length
+        return True
+    
+    # If it doesn't match straight away, try globbing until
+    # the first character of the string doesn't match anymore.
+    i = 0
+    while matches_first_char(s[i:], r):
+        if matches(s[i+1:], r[2:]):
+            return True
+        i += 1
+    return False
+```
+
+
+```python
+r = "tx."
+s = "txt"
+matches(s, r)
+```
+
+
+
+
+    True
+
+
+
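+Written this way, the backtracking can revisit the same pair of suffixes many times, so the worst case is exponential. Caching the result for each (suffix of s, suffix of r) pair, as in the standard dynamic-programming formulation, gives **O(len(s) * len(r))** time and space.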
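One way to get one evaluation per pair of suffixes is to recurse on indices and cache the results. The sketch below assumes a valid pattern, as the problem statement does; the names `matches_memo` and `match_from` are hypothetical, and `functools.lru_cache` is just one convenient way to memoize.

```python
from functools import lru_cache


def matches_memo(s, r):
    """Return True if the whole string s matches pattern r ('.' and '*' only)."""

    @lru_cache(maxsize=None)
    def match_from(i, j):
        # Does s[i:] match r[j:]?
        if j == len(r):
            return i == len(s)
        first = i < len(s) and r[j] in (s[i], ".")
        if j + 1 < len(r) and r[j + 1] == "*":
            # Either skip "x*" entirely, or consume one character and stay on "x*".
            return match_from(i, j + 2) or (first and match_from(i + 1, j))
        return first and match_from(i + 1, j + 1)

    return match_from(0, 0)


print(matches_memo("chat", ".*at"), matches_memo("chats", ".*at"))
```

Each (i, j) pair is computed once and does constant work, which is where the len(s) * len(r) bound comes from. Very long inputs may hit Python's recursion limit, in which case a bottom-up table is the usual alternative.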
+
+Fun fact: Stephen Kleene introduced the * operator in regular expressions and as such, it is sometimes referred to as the Kleene star.
+
+