I have a supervised binary classification problem. I tuned an XGBoost model on the training set and achieved a reasonably high accuracy on the test set. Now I want to interpret the results of the model. I used the SHAP library to interpret the results and, for the most part, they are consistent with what I would expect. However, there is one feature that, if we average over all SHAP values, is ranked as the 7th most important and I would …
When do we use one or the other? My use case: I want to evaluate a linear space to see how good retrieval results are. I have a set of data X (m x n) and some weights W (m x 1). I want to measure the nearest neighbour retrieval performance on W'X with a ground truth value Y. This is a continuous value, so I can't use simple precision/recall. If I use rank correlation, I will find the correlation …
I would like to allow contributors to upload files, so I am using the recommended plugin WP Role Editor. After activating it, I go to the plugin via User > User Role Editor and select contributor in the dropdown. Then I check upload_files and hit Update. Next, I log in with a contributor account to test uploading a file. Great, I see the media upload button, but when I click to upload a file, I get this …
When we are doing weighted least squares, how do we find the weights? Every tutorial I see just uses $w_i = \frac{1}{\sigma_i^2}$ and applies it to basic data. But I want to know how to find the weights for real data. Is it always the inverse of the square of variance?
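As a minimal sketch of the textbook case the tutorials describe, assuming the per-observation noise standard deviation $\sigma_i$ is known: the function name `weighted_least_squares`, the synthetic data, and the choice `sigma = 0.1 * x` are all hypothetical, and this does not address how to estimate the weights when $\sigma_i$ is unknown.

```python
import numpy as np


def weighted_least_squares(X, y, sigma):
    """Minimise sum_i w_i * (y_i - x_i @ beta)**2 with w_i = 1 / sigma_i**2."""
    w = 1.0 / np.asarray(sigma, dtype=float) ** 2
    sw = np.sqrt(w)
    # Scaling each row by sqrt(w_i) turns the weighted problem into
    # an ordinary least-squares problem.
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta


# Synthetic heteroscedastic data (assumption: the noise grows with x).
rng = np.random.default_rng(0)
n = 200
x = np.linspace(1.0, 10.0, n)
sigma = 0.1 * x
y = 2.0 + 3.0 * x + rng.normal(0.0, sigma)

X = np.column_stack([np.ones(n), x])  # intercept column plus slope column
beta = weighted_least_squares(X, y, sigma)
print(beta)  # estimated [intercept, slope]
```

Observations with a larger assumed $\sigma_i$ receive a smaller weight, so the noisier points pull the fit less.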
I am trying to implement a repeater / post object so I can select the posts that I want to display in the sidebar. I am using this code: <?php while ( have_rows('top_posts_repeater')) : the_row(); // loop through the repeater fields ?> <?php // set up post object $post_object = get_sub_field('selection'); if( $post_object ) : $post = $post_object; setup_postdata($post); ?> <article class="your-post"> <?php the_title(); ?> <?php the_post_thumbnail(); ?> <?php // whatever post stuff you want goes here ?> </article> <?php wp_reset_postdata(); …
I'm working on a regression problem with a few high-cardinality categorical features (forecasting different items with a single model). Someone suggested using target encoding (the mean/median of the target for each item) together with XGBoost. While I understand how this new feature would improve a linear model (or GMMs in general), I do not understand how this approach would fit into a tree-based model (regression trees, random forest, boosting). Given the feature is used for splitting, items with a mean below …
Here is my code: file_name = ['0a57bd3e-e558-4534-8315-4b0bd53df9d8.jpeg', '20d721fc-c443-49b2-aece-fd760f13ff7e.jpeg'] img_id = {} images = [] for e, i in enumerate(range(len(file_name))): img_id['file_name'] = file_name[e] images.append(img_id) print(images) The output is: [{'file_name': '20d721fc-c443-49b2-aece-fd760f13ff7e.jpeg'}, {'file_name': '20d721fc-c443-49b2-aece-fd760f13ff7e.jpeg'}] I want it to be: [{'file_name': '0a57bd3e-e558-4534-8315-4b0bd53df9d8.jpeg'}, {'file_name': '20d721fc-c443-49b2-aece-fd760f13ff7e.jpeg'}] I don't know why it saves only the last file name in the dictionary.
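The behaviour described above can be reproduced in a small sketch. It assumes the cause is that a single dictionary object is appended on every iteration; the names `shared` and `aliased` are hypothetical and only serve the illustration.

```python
file_name = [
    "0a57bd3e-e558-4534-8315-4b0bd53df9d8.jpeg",
    "20d721fc-c443-49b2-aece-fd760f13ff7e.jpeg",
]

# One dict created outside the loop: each append stores a reference to the
# same object, so the last assignment is visible through every list entry.
shared = {}
aliased = []
for name in file_name:
    shared["file_name"] = name
    aliased.append(shared)
print(aliased[0] is aliased[1])  # both entries are the same dict object

# A fresh dict per file name keeps the entries independent.
images = [{"file_name": name} for name in file_name]
print(images)
```

Creating the dictionary inside the loop (or in a comprehension, as here) gives each list entry its own object.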
I am not able to find a list of the main statistical models. Is it possible to divide statistical models into categories such as supervised (regression, classification) vs. unsupervised (clustering), or is that something used in the field of machine learning but not for categorizing statistical models? Thank you
I would like to know how I can remove these individually. <style id='wp-block-separator-inline-css'> @charset "UTF-8";.wp-block-separator{border-bottom:1px soli ... </style> WP has a feature to add inline JS (and also CSS, I think) to registered scripts/styles. So dequeue should work, but it does not. add_action( 'wp_enqueue_scripts', __NAMESPACE__ . '\action_wp_enqueue_scripts', 99 ); function action_wp_enqueue_scripts() { wp_dequeue_style( 'wp-block-navigation' ); // Comes from a file and works wp_dequeue_style( 'wp-block-post-comments-form' ); // Comes from a file and works wp_dequeue_style( 'wp-block-seperator' ); // Does not work } …
I have a problem similar to the one in the title, but not the same. The problem in the title allows me to explain the dynamics of my need. I have to determine the optimal value of a variable called QUOTA or LIMIT for a credit card. The goal of the model is to allow me to minimize the probability of default, given this variable and others that characterize my customer. What is the best way to determine …
Sorry if this is a simple/daft question but I'm still getting to grips with how WordPress search functions. I want to completely replace the standard search within my template with a custom search that only queries a certain custom post type and its meta fields. I have a search form which does this and search.php which returns the correct data. However, the search will not function unless I include an input field named 's' and it is not empty. I …
A reproducible example with a small bit of R code is available in this Stack Overflow post (linked so I don't need to retype the code). The fuzzytext library in R has the following string methods available: c("osa", "lv", "dl", "hamming", "lcs", "qgram", "cosine", "jaccard", "jw", "soundex"). Our use case is matching (left-joining) basketball player names from 2 different sources. From the Stack Overflow post, we have the following concerns to account for when string matching names: The left join shouldn't …
I am using an API which tracks metrics on parts of my site. The only useful bit it saves, though, is the URL (permalink). What is the most efficient way to query all the posts that match those permalinks, given that I don't have access to the IDs to use post__in with WP_Query?
I have been implementing a DecisionTreeRegressor model in an Anaconda environment with a data set sourced from a 20 million row, 12-dimensional CSV file. I could read the data set in chunks, with chunksize set to 500,000 rows, and compute the R-squared score on the training/test split data sets in each iteration of 500,000 rows until iteration #20. sklearn.__version__: 0.19.0 pandas.__version__: 0.20.3 numpy.__version__: 1.13.1 The GridSearchCV() instance uses a parameter grid with parameter max_depth set to values …
Suppose I have a set of examples $X = (x_1,x_2,..,x_n)$ with continuous numeric targets $Y = (y_1,y_2,..,y_n)$. While it is standard to use regression models to make point predictions of $y_i$ as $f(x_{i}) = \hat{y}_i$, I am interested in predicting a density function for $y_{i}$. What I want is analogous to the use of probabilities in classification instead of hard predictions (e.g. predict vs predict_proba in Scikit-learn), but for continuous regression problems. Specifically, a different density function (e.g. in the …
How can I train a multivariate-to-multiclass sequence classifier using an LSTM in Keras? I have 50000 sequences, each 100 timepoints long. At every time point, I have 3 features (so the width is 3). I have 4 classes and I want to build a classifier to determine the class of each sequence. What is the best way to do so? I saw many guides for univariate sequence classification but none for multivariate, and I don't know how to apply this …
I want to do a grid search over a few hyperparameters of an XGBClassifier for a binary class, but whenever I run it the score value (roc_auc) is not displayed. I read in another question that this can be related to some error in model training, but I am not sure which one it is in this case. My model training data X_train is a np.array of shape (X, 19) and my y_train is a numpy.ndarray of shape (X, ) which …
I want to compare 2 datasets and check their similarity. I have tried statistical tests like the KS test and z-test, but they gave a p-value of 0.0 for most columns. I then read that the KS test won't work because the dataset size is huge and it will exaggerate even slight differences. Then I tried the Bhattacharyya distance and the Hellinger distance, but the probability values are coming out as 0.01 (which is correct since it is a continuous variable). I am trying to …
Is SI the RMSE divided by the average of the observed values (or of the predicted values? I am confused)? Is SI = 25% acceptable (is the model good enough)?
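A minimal sketch of the computation, assuming SI here means the scatter index and that it is normalised by the mean of the observed values (the question itself is unsure about this, so treat the convention as an assumption). The function name `scatter_index` and the sample numbers are hypothetical.

```python
import math


def scatter_index(observed, predicted):
    """RMSE divided by the mean of the observed values (assumed convention)."""
    n = len(observed)
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    return rmse / (sum(observed) / n)


# Made-up values: every prediction is off by exactly 1, so RMSE = 1,
# and the observed mean is 5.
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [3.0, 3.0, 7.0, 7.0]
si = scatter_index(observed, predicted)
print(f"SI = {si:.0%}")
```

Normalising by the predicted mean instead would only require swapping the denominator, which is why it matters to state which convention is used when reporting a percentage.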
I need help translating a custom post type and taxonomy when using multisite and multiple languages. I am using subdirectories (/en, /sv, etc.). I am using the plugin Multisite Language Switcher, but cannot change the rewrite settings there. So I am guessing I have to change some rewrite rules? Or should I translate the post type with translation files (.mo, .po)? This is how the post type is set up in functions.php. Should I do something with the rewrite? …
I'm trying to determine if this is related to my having the latest version of PHP on my server while using the latest version of WordPress, or if I'm just doing it wrong. Here's my function that is correctly returning values (I can see them when I do an echo or a var dump): function my_Get_CURL (){ $url = 'http://someotherserver/event/94303?radius=30'; // Initiate curl $ch = curl_init(); // Disable SSL verification curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); // Will return the response, if false …
In an LSTM there are 4 gates: input, output, forget and a gate whose output is multiplied element-wise with the output of the input gate, and the result is added to the cell state (I don't know the name of this gate, but it's the one in the picture below with the output C_tilde). Why is the addition of the C_tilde gate required in the model? In order to allow the input gate to subtract from the cell state, we …
I have a WP_Query on my homepage template that works just fine for get_the_title() and MANY other functions for each post in the loop. However, any way of getting the comment count just grabs the real count for the first post and then repeats the same output for the rest of the posts. I've tried both echo get_comments_number($post->ID); and comments_number(); The kicker: if I use the same template NOT on the homepage, comment count works fine.
I was going through section 4.1 of the CatBoost paper, where they talk about the 'Analysis of prediction shift' using an example consisting of 2 features which are Bernoulli random variables. I am unable to wrap my head around the experimental setup. Since there are only 2 indicator features, we can have only 4 distinct data points; everything else will be duplication. They mention that for train data points the output of the first estimator of the boosting model is biased, …
So I've been trying to improve my Random Decision Tree model for the Titanic Challenge on Kaggle by introducing a Validation Dataset, and now I encounter this roadblock, as shown by the images below: Validation Dataset Test Dataset After inspecting these datasets using the .info function, I've found that the Validation Dataset contains 178 and 714 non-null floats, while the Test Dataset contains an assorted 178 and 419 non-null floats and integers. Further, the Datasets contain duplicate rows, which I …
I'm fine-tuning the wav2vec-xlsr model. I've created a virtual env for that and installed CUDA 11.0 and tensorflow-gpu==2.5.0, but it gives the following error: ValueError: Mixed precision training with AMP or APEX (--fp16 or --bf16) and half precision evaluation (--fp16_full_eval or --bf16_full_eval) can only be used on CUDA devices. I want to fine-tune the model on a GPU. Any help?
Are there theoretical or empirical reasons for drawing initial weights of a multilayer perceptron from a Gaussian rather than from, say, a Cauchy distribution?
Hi everyone, I'm a newbie in WordPress coding, so please guide me accordingly. I want to show a permanent word counter on the Page Create screen. Any help would be really appreciated, thanks.
I have a question regarding terminology. When dealing with Hugging Face transformer models, I often read about "using pretrained models for classification" vs. "fine-tuning a pretrained model for classification." I fail to understand what the exact difference between these two is. As I understand it, pretrained models by themselves cannot be used for classification, regression, or any relevant task without attaching at least one more dense layer and one more output layer, and then training the model. In this case, we would …
I can't solve this issue. I tried everything I found on the web. I first tried configuring my nginx.conf following the example in the Codex, with no success: https://codex.wordpress.org/Nginx I found that many users encounter this issue, but the most popular fix is this: location / { try_files $uri $uri/ /index.php?$args; } I still get a 404 for all pages. This only happens if I don't use the Plain setting in the permalink structure. Any ideas on how I can fix this? Thank …
Odds is the chance of an event occurring against the event not occurring. Likelihood is the probability of a set of parameters being supported by the data in hand. In logistic regression, we use log odds to convert a probability-based model to a likelihood-based model. In what way are odds & likelihood related? And can we call odds a type of conditional probability?
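To make the two quantities concrete, here is a minimal sketch assuming a Bernoulli outcome with success probability p. The helper names (`odds`, `log_odds`, `sigmoid`, `bernoulli_log_likelihood`) are hypothetical. Odds and log odds are functions of a single probability, while the likelihood is a function of the model's probabilities evaluated against observed labels.

```python
import math


def odds(p):
    """Odds of an event with probability p: p / (1 - p)."""
    return p / (1.0 - p)


def log_odds(p):
    """Logit of p; logistic regression models this as a linear function of x."""
    return math.log(odds(p))


def sigmoid(z):
    """Inverse of the logit: maps a log-odds value back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def bernoulli_log_likelihood(labels, probs):
    """Log-likelihood of 0/1 labels under the predicted probabilities."""
    return sum(
        y * math.log(p) + (1 - y) * math.log(1.0 - p)
        for y, p in zip(labels, probs)
    )


p = 0.75
print(odds(p))               # odds of 3 to 1
print(sigmoid(log_odds(p)))  # recovers p up to rounding
print(bernoulli_log_likelihood([1, 0, 1], [0.75, 0.25, 0.75]))
```

In this sketch the log odds parameterise the probability, and the likelihood then scores those probabilities against the data; the two are connected through p rather than being the same kind of object.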
I am getting a weird problem with a client site. There is an existing WordPress installation on it that has been working without any problem. I have deployed a mobile application that uses the WordPress API. Suppose my laptop and the mobile app are connected to the same Internet box. After using the mobile app, I am unable to access the website, even with my laptop. It seems as if the WordPress server or WordPress instance is banning my IP. …