Compare commits
10 commits
51345db93f
...
245f7a941c
Author | SHA1 | Date | |
---|---|---|---|
245f7a941c | |||
d88de2ec0f | |||
89d6053c9d | |||
ed9875c54e | |||
f392c8f897 | |||
f36f5be5d0 | |||
2173d6f7fa | |||
f1276154ac | |||
31ed46d6d9 | |||
45d7c671c7 |
3 changed files with 163 additions and 34 deletions
32
README.org
|
@ -21,15 +21,32 @@ Right now there is no automated way to generate your feed url but making one by
|
|||
**** URL parameters
|
||||
|
||||
| Param name | Default value | Can have multiple? | Mandatory? | Short description |
|------------+---------------+--------------------+-----------------+--------------------------------------------------------------------------------------------------|
| board | "mlp" | No | No | Which board to generate feed for, *ONLY* /mlp/ is supported |
| q | "" | Yes | Yes (1 or more) | This string is used to filter threads according to their titles |
|------------+---------------+--------------------+----------------------+--------------------------------------------------------------------------------------------------|
| board | "mlp" | No | No (not implemented) | Which board to generate feed for, *ONLY* /mlp/ is supported |
| q | nil | Yes | Yes (1 or more) | This string is used to filter threads according to their titles |
| chod | 94 | No | No | CHanceOfDeath - includes a thread in the feed if its chance of death is >= chod |
| repeat | ~false~ | No | No | Whether to make a new notification on every server update even when the thread doesn't have a higher chod |
| recreate | ~bool~ | Not implemented | Not implemented | Whether to notify when creation of a new thread matching the query is detected (uses 4chan's RSS) |
|
||||
|
||||
**** How to create URL

- Standard URL rules apply; if you know how to pass params in a URL to any website, you don't even have to read this
- Open a text editor
- Paste in the default URL: ~https://tools.treebrary.org/thread-watcher/feed.xml?~ (you can use plain HTTP if you want to)
- Now you can append any of the supported parameters (listed in the table above):
  - For example, to be informed about threads with "cute" in their title:
    - ~q=cute~, which makes ~https://tools.treebrary.org/thread-watcher/feed.xml?q=cute~
  - If you want more than one param, separate them with ~&~, for example:
    - ~q=cute~ and ~q=pretty~: ~https://tools.treebrary.org/thread-watcher/feed.xml?q=cute&q=pretty~
  - The same is true when you also want to specify ChoD:
    - ~https://tools.treebrary.org/thread-watcher/feed.xml?q=cute&q=pretty&chod=98~
    - This will only notify you about threads that:
      - Have ~cute~ or ~pretty~ in their title
      - Are at 98% or deeper in the catalog (around position 147/150, i.e. about 3 threads away from being bumped off)
- Note that ~//~ are not special characters; ~q=/general/~ will work as expected and match threads with "/general/" in their title
- Also note that regex is *NOT* supported for now, so something like ~q=rainbow*~ will only match threads with "rainbow[asterisk]"
  in their title
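
The URL-building steps above can also be sketched in Clojure. This is a minimal illustration and not part of the project: ~feed-url~ and ~base-url~ are hypothetical names, and the only things taken from the README are the endpoint and the ~q~, ~chod~ and ~repeat~ parameters. Values are percent-encoded on the assumption that the server decodes them as ordinary query parameters.

```clojure
(require '[clojure.string :as str])
(import 'java.net.URLEncoder)

;; Hypothetical helper, not part of rss-thread-watch.
(def base-url "https://tools.treebrary.org/thread-watcher/feed.xml")

(defn feed-url
  "Builds a feed URL from a collection of title queries and the optional
  :chod and :repeat? settings. Every value is percent-encoded."
  [queries & {:keys [chod repeat?]}]
  (let [enc   (fn [v] (URLEncoder/encode (str v) "UTF-8"))
        ;; One q=... pair per query, then the optional parameters.
        pairs (concat (map #(str "q=" (enc %)) queries)
                      (when chod [(str "chod=" (enc chod))])
                      (when repeat? ["repeat=true"]))]
    (str base-url "?" (str/join "&" pairs))))

(println (feed-url ["cute" "pretty"] :chod 98))
;; prints https://tools.treebrary.org/thread-watcher/feed.xml?q=cute&q=pretty&chod=98
```

A query such as ~/general/~ comes out as ~q=%2Fgeneral%2F~, which decodes back to the same string on the server side.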
|
||||
|
||||
*** Generating URL interactively
|
||||
|
||||
Coming soon
|
||||
|
@ -42,8 +59,15 @@ This is an experimental project. There are several limitations:
|
|||
|
||||
** Feature set
|
||||
|
||||
- Planned/finished features
- Planned/finished features [7%]
  - [X] [DONE] Super basic features done (feed, query, repeat)
  - [ ] No params request should redirect to url generator or (for now) documentation
  - [ ] Have proper sorting - the threads most likely to die first (is it enough to reverse the last input of the filter?)
  - [ ] Config file instead of hardcoding config values
  - [ ] Include time of latest data fetch
  - [ ] Make threads have preview images taken from the actual thread OP
  - [ ] Show which query matched the thread you were notified of
  - [ ] Option to include advanced HTML formatting of text (different color text for ChoD etc)
  - [ ] Support notification on watched thread re-creation after it died
  - [ ] Support notification for thread death
  - [ ] Support multiple boards at once
|
||||
|
|
|
@ -1,29 +1,65 @@
|
|||
;; Copyright (C) 2023 Felisp
|
||||
;;
|
||||
;; This program is free software: you can redistribute it and/or modify
|
||||
;; it under the terms of the GNU Affero General Public License as published by
|
||||
;; the Free Software Foundation, version 3 of the License.
|
||||
;;
|
||||
;; This program is distributed in the hope that it will be useful,
|
||||
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||
;; GNU Affero General Public License for more details.
|
||||
;;
|
||||
;; You should have received a copy of the GNU Affero General Public License
|
||||
;; along with this program. If not, see <https://www.gnu.org/licenses/>.
|
||||
|
||||
(ns rss-thread-watch.core
|
||||
(:require [ring.adapter.jetty :as jetty])
|
||||
(:require [ring.adapter.jetty :as jetty]
|
||||
[ring.middleware.params :as rp]
|
||||
[rss-thread-watch.watcher :as watcher]
|
||||
[rss-thread-watch.feed-generator :as feed])
|
||||
(:gen-class))
|
||||
|
||||
;; Internal default config
|
||||
(def CONFIG
|
||||
"Internal default config"
|
||||
{:target "https://api.4chan.org/mlp/catalog.json" ;Where to download catalog from
|
||||
:start-index-at 50.0 ;We will search all threads that are lower in catalog than this % value
|
||||
:starting-page 7 ;Only monitor threads from this page and up
|
||||
:refresh-delay (* 60 5) ;Redownload catalog every 5 mins
|
||||
:port 6969 ;Liston on 6969
|
||||
:port 6969 ;Listen on 6969
|
||||
})
|
||||
|
||||
(defn set-interval
|
||||
"Calls function every ms"
|
||||
[callback ms]
|
||||
(future (while true (do (try
|
||||
(callback)
|
||||
(println "Recached")
|
||||
(catch Exception e
|
||||
(binding [*out* *err*]
|
||||
(println "Error while updating cache: " e ", retrying in 5 minutes"))))
|
||||
(Thread/sleep ms)))))
|
||||
|
||||
(defn -main
|
||||
"Entry point, starts webserver"
|
||||
[& args]
|
||||
())
|
||||
|
||||
(defn handler [rqst]
|
||||
{:status 404
|
||||
:header {"Content-Type" "text/html"}
|
||||
:body "No poines here ^:("})
|
||||
(println "Starting on port: " (:port CONFIG)
|
||||
"\nGonna recache every: " (:refresh-delay CONFIG) "s")
|
||||
(set-interval (fn []
|
||||
(println "Starting cache update")
|
||||
(watcher/update-thread-cache! (:target CONFIG) (:starting-page CONFIG)))
|
||||
(* 1000 (:refresh-delay CONFIG)))
|
||||
(jetty/run-jetty (rp/wrap-params feed/http-handler) {:port (:port CONFIG)
|
||||
:join? true}))
|
||||
|
||||
;; Docs: https://github.com/ring-clojure/ring/wiki/Getting-Started
|
||||
(defn repl-main
|
||||
"Development entry point"
|
||||
[]
|
||||
(jetty/run-jetty handler {:port (:port CONFIG)
|
||||
(jetty/run-jetty (rp/wrap-params #'feed/http-handler)
|
||||
{:port (:port CONFIG)
|
||||
;; Don't block REPL thread
|
||||
:join? false}))
|
||||
;; (repl-main)
|
||||
;; Single cache update for repl
|
||||
;; (watcher/update-thread-cache! (:target CONFIG) (:starting-page CONFIG))
|
||||
;; (watcher/update-thread-cache! "/home/michal/Zdrojaky/Clojure/rss-thread-watch/resources/catalog-pts9.json" (:starting-page CONFIG))
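
The ~set-interval~ loop in this file runs forever once started. For REPL work it can help to be able to stop it, so here is a minimal sketch of a cancellable variant. ~run-periodically~ is a hypothetical name and not part of the project; the sketch assumes the callback itself does not block on interruptible I/O.

```clojure
;; Hypothetical variant of set-interval that stops when the returned
;; future is cancelled with future-cancel.
(defn run-periodically
  "Calls `callback` every `ms` milliseconds on a background thread.
  Exceptions thrown by the callback are reported and do not stop the loop."
  [callback ms]
  (future
    (while (not (Thread/interrupted))
      (try
        (callback)
        (catch Exception e
          (binding [*out* *err*]
            (println "Error while updating cache:" (.getMessage e)))))
      ;; future-cancel interrupts this sleep, which ends the loop.
      (Thread/sleep (long ms)))))

(comment
  ;; Usage at the REPL:
  (def refresher (run-periodically #(println "Recached") 1000))
  (future-cancel refresher))
```

Keeping ~Thread/sleep~ outside the ~try~ is deliberate: the interruption raised by ~future-cancel~ is then not swallowed by the error handler.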
|
||||
|
|
|
@ -13,11 +13,11 @@
|
|||
;; along with this program. If not, see <https://www.gnu.org/licenses/>.
|
||||
|
||||
(ns rss-thread-watch.feed-generator
|
||||
"Generates feed for requests"
|
||||
"Generates feeds for requests"
|
||||
(:require [ring.middleware.params :as rp]
|
||||
[clj-rss.core :as rss]
|
||||
[clojure.string :as s]
|
||||
[rss-thread-watch.watcher :as cache])
|
||||
[rss-thread-watch.watcher :as watcher])
|
||||
(:gen-class))
|
||||
|
||||
|
||||
|
@ -34,7 +34,7 @@
|
|||
|
||||
This is done by always making new GUID - (concat thread-number UNIX-time-of-data-update)"
|
||||
[thread time]
|
||||
(assoc thread guid (str (:no thread)
|
||||
(assoc thread :guid (str (:no thread)
|
||||
"-"
|
||||
time)))
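
The two GUID strategies in this file can be compared side by side. The following is a minimal sketch with hypothetical function names (~guid-per-update~, ~guid-per-chod~); it assumes a thread is a map with a numeric ~:no~ and a numeric ~:chod~, as in the surrounding code.

```clojure
;; Hypothetical stand-ins for the two GUID functions in the diff.

(defn guid-per-update
  "GUID changes on every data update, so a reader re-notifies each time."
  [thread update-time]
  (str (:no thread) "-" update-time))

(defn guid-per-chod
  "GUID changes only when the chance of death changes."
  [thread]
  ;; %.2f needs a floating point value, hence the explicit double.
  (format "%d-%.2f" (:no thread) (double (:chod thread))))

(let [t {:no 123 :chod 97.5}]
  (println (guid-per-update t 1000))                       ; prints 123-1000
  (println (= (guid-per-chod t) (guid-per-chod t)))        ; prints true
  (println (= (guid-per-chod t)
              (guid-per-chod (assoc t :chod 98.0)))))      ; prints false
```

An RSS reader treats an item with an unseen GUID as new, which is why regenerating the GUID on every update produces repeated notifications.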
|
||||
|
||||
|
@ -45,20 +45,23 @@
|
|||
[thread]
|
||||
(assoc thread :guid (format "%d-%.2f"
|
||||
(:no thread)
|
||||
:chod thread)))
|
||||
(:chod thread))))
|
||||
|
||||
(defn filter-chod-posts
|
||||
"Return list of all threads with equal or higher ChoD than requested"
|
||||
[query-vec chod-treshold repeat?]
|
||||
[query-vec chod-treshold repeat? cache]
|
||||
(let [time-of-generation (System/currentTimeMillis)
|
||||
guid-fn (if repeat? (fn [x] (new-guid-always x time-of-generation))
|
||||
update-only-guid)
|
||||
cache-start-index (first (indices (fn [x] (>= x chod-treshold))
|
||||
@cache/chod-threads-cache))
|
||||
cache-start-index (first (indices (fn [x] (>= (:chod x) chod-treshold))
|
||||
cache))
|
||||
;; So we don't have to search thru everything we have cached
|
||||
needed-cache-part (subvec @cache/chod-threads-cache cache-start-index)
|
||||
needed-cache-part (subvec cache cache-start-index) ;Todo: remove that ugly global reference
|
||||
actuall-matches (keep (fn [t]
|
||||
(let [title (:title t)]
|
||||
;; Todo: Man, wouldn't it be cool to know which query matched the thread?
;; It would be so much easier for the user to figure out why it is showing
;; and it would solve the problem of super long titles (or OPs instead of titles)
|
||||
(when (some (fn [querry]
|
||||
(s/includes? title querry))
|
||||
query-vec)
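
The Todo in this hunk asks for the matching query to be reported. One possible shape is sketched below; ~first-matching-query~, ~match-threads~ and the ~:matched-query~ key are hypothetical and not part of the project. Like the code above it relies on ~clojure.string/includes?~, so matching is a case-sensitive substring test.

```clojure
(require '[clojure.string :as str])

;; Hypothetical helpers, not part of rss-thread-watch.

(defn first-matching-query
  "Returns the first query contained in `title`, or nil when none matches.
  A nil title never matches."
  [title queries]
  (when title
    (some (fn [q] (when (str/includes? title q) q)) queries)))

(defn match-threads
  "Keeps only threads whose title matches a query and records which one."
  [threads queries]
  (keep (fn [t]
          (when-let [q (first-matching-query (:title t) queries)]
            (assoc t :matched-query q)))
        threads))

(def sample-threads
  [{:no 1 :title "Cute pony general"}
   {:no 2 :title "rainbow thread"}
   {:no 3 :title nil}])

(println (match-threads sample-threads ["cute" "rainbow"]))
;; Only thread 2 is kept: "Cute" differs from "cute" in case.
```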
|
||||
|
@ -67,6 +70,28 @@
|
|||
;; Finally generate and append GUIDs
|
||||
(map guid-fn actuall-matches)))
|
||||
|
||||
(defn thread-to-rss-item
|
||||
"If I wasnt retarded I could have made the cached version look like
|
||||
rss item already but what can you do. I'll refactor I promise, I just need this done ASAP" ;Todo: do what the docstring says
|
||||
[t]
|
||||
(let [link-url (str "https://boards.4chan.org/mlp/thread/" (:no t))] ; jesus, well I said only /mlp/ is supported now so fuck it
|
||||
{:title (format "%.2f%% - %s" (:chod t) (:title t))
|
||||
;; :url link-url <- this is supposed to be for images according to: https://cyber.harvard.edu/rss/rss.html
|
||||
:description (format "The thread: '%s' has %.2f%% chance of dying" (:title t) (:chod t))
|
||||
:link link-url
|
||||
:guid (:guid t)}))
|
||||
|
||||
(defn generate-feed
|
||||
"Generates feed from matching items"
|
||||
[query-vec chod-treshold repeat? cache]
|
||||
(let [items (filter-chod-posts query-vec chod-treshold repeat? cache)
|
||||
head {:title "RSS Thread watcher v0.1"
|
||||
:link "https://tools.treebrary.org/thread-watcher/feed.xml"
|
||||
:feed-url "https://tools.treebrary.org/thread-watcher/feed.xml"
|
||||
:description "RSS based thread watcher"}
|
||||
body (map thread-to-rss-item items)]
|
||||
(rss/channel-xml head body)))
|
||||
|
||||
(defn http-handler
|
||||
"Handles HTTP requests, returns generated feed
|
||||
|
||||
|
@ -74,7 +99,51 @@
|
|||
rss-thread-watch.watcher.chod-threads-cache
|
||||
rss-thread-watch.core.CONFIG"
|
||||
[rqst]
|
||||
(try (let [{{chod "chod" :or {chod 60}
|
||||
:as prms} :params
|
||||
uri :uri} rqst
|
||||
queries (if (vector? (prms "q")) (prms "q") [(prms "q")]) ; to always return vector
|
||||
repeat? (prms "repeat")
|
||||
real-chod (try ;If we can't parse a number from the given chod param, just use 94
|
||||
(if (or (vector? chod)
|
||||
(< (Integer/parseInt chod) 60)) ; Never accept chod lower than 60 TODO: don't hardcode this
|
||||
94 (Integer/parseInt chod))
|
||||
(catch Exception e
|
||||
94))
|
||||
cache @watcher/chod-threads-cache]
|
||||
;; (println "RCVD: " rqst)
|
||||
(println rqst)
|
||||
;; ====== Errors =====
|
||||
;; Something other than feed.xml requested
|
||||
(when-not (s/ends-with? uri "feed.xml")
|
||||
(throw (ex-info "404" {:status 404
|
||||
:header {"Content-Type" "text/plain"}
|
||||
:body "404 This server has nothing but /feed.xml"})))
|
||||
;; No query specified - don't know what to search for
|
||||
(when-not (prms "q")
|
||||
(throw (ex-info "400" {:status 400
|
||||
:header {"Content-Type" "text/plain"}
|
||||
:body (str "400 You MUST specify query with one OR more 'q=searchTerm' url parameter(s)\n\n\n"
"Example: '/feed.xml?q=pony&q=IWTCIRD' will show in your feed all threads with 'pony' or 'IWTCIRD'"
|
||||
" in their title that are about to die.")})))
|
||||
;; Whether cache has been generated yet
|
||||
(when (empty? cache)
|
||||
(throw (ex-info "503" {:status 503
|
||||
:header {"Content-Type" "text/plain"}
|
||||
:body (str "503 Service Unavailable\n"
|
||||
"Cache is empty, cannot generate feed. Try again later, it may work.")})))
|
||||
;; ==== Everything good ====
|
||||
{:status 200
|
||||
:header {"Content-Type" "text/html"}
|
||||
:body "All pony here ^:)"})
|
||||
|
||||
;; There shouldn't be any problems with this mime type but if there are
|
||||
;; replace with "text/xml", or even better, get RSS reader that is not utter shit
|
||||
:header {"Content-Type" "application/rss+xml"}
|
||||
:body (generate-feed queries real-chod repeat? cache)})
|
||||
(catch Exception e
|
||||
;; Ex-info has been crafted to match HTTP response body so we can send it
|
||||
(if-let [caught (ex-data e)] ;I think there will always be ex-data, though; this should check whether it contains some key (:body? so no extra key is needed)
|
||||
caught ;We have custom crafted error
|
||||
{:status 500 ;Something else fucked up, we print what happened
|
||||
:header {"Content-Type" "text/plain"}
|
||||
:body (str "500 - Something fucked up while generating feed, If you decide to report it, please include the URL address you used:\n"
|
||||
(ex-cause e) "\n"
|
||||
e)}))))
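
The parameter handling and error mapping in ~http-handler~ can be pulled into small pure functions, which makes them easy to check at the REPL. This is a minimal sketch under stated assumptions: ~parse-chod~, ~error-response~ and ~exception->response~ are hypothetical names, the 94 and 60 values are copied from the handler above, and the header key is spelled ~:headers~ because that is the key named in Ring's response map spec.

```clojure
;; Hypothetical helpers, not part of rss-thread-watch.

(def default-chod 94)  ; used when chod is missing or unusable
(def minimum-chod 60)  ; lower values fall back to the default

(defn parse-chod
  "Returns the chod to use for a raw request parameter, which may be nil,
  a string, or a vector when the parameter was given more than once."
  [raw]
  (let [n (try
            (when (string? raw) (Integer/parseInt ^String raw))
            (catch NumberFormatException _ nil))]
    (if (and n (>= n minimum-chod)) n default-chod)))

(defn error-response
  "Builds an exception whose ex-data is already a Ring response map."
  [status message]
  (ex-info (str status) {:status  status
                         :headers {"Content-Type" "text/plain"}
                         :body    message}))

(defn exception->response
  "Uses ex-data as the response only when it looks like one,
  otherwise falls back to a generic 500."
  [e]
  (let [data (ex-data e)]
    (if (and (map? data) (contains? data :status) (contains? data :body))
      data
      {:status  500
       :headers {"Content-Type" "text/plain"}
       :body    (str "500 - " (.getMessage e))})))

(println (parse-chod "98"))                                            ; prints 98
(println (:status (exception->response (error-response 400 "bad"))))   ; prints 400
(println (:status (exception->response (ex-info "other" {:foo 1}))))   ; prints 500
```

Checking for ~:status~ and ~:body~ covers the case raised in the comment above: an ~ex-info~ thrown by some other library carries ex-data too, but it is not a response map.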
|
||||
|
|