Usenet gateway

THIS THREAD IS CLOSED!
Posts in here will NOT be answered

This hack has undergone many major changes since this thread was started. Consequently, most of the posts have become dated and of little use. To coincide with the latest major release (20010712), a new one has been started.

See this thread (http://www.vbulletin.com/forum/showthread.php?s=&threadid=22599) for the latest version and discussion

thanks for this, i am so happy now :)

I am going to install it, play with it, and see if I can suggest things for it, but I will probably wait to open it up to my members until the v2.0 version comes out.

I have to say this is one of the best features out there for vBulletin and, bar co-branding and unlimited-depth forums, it beats most of the additions for v2. Thanks so much for making it :)

ok problem...

I installed some of the modules using CPAN, which was OK, but one of them somehow decided that it wanted to reinstall the whole of Perl without even asking. This scared me quite a lot and I quit it, which may not have been the best thing to do, I don't know.

Anyway, after that little episode i think i did everything but i get this error message:


Net::NNTP: Bad hostname 'spamkiller.newsfeeds.com
' at newnews.pl line 219
Unable to connect to spamkiller.newsfeeds.com


Now, is this a problem with Net::NNTP (not being installed?) or with the hostname (which is correct; it works in Outlook Express anyway)? Any suggestions?

thanks..

spamkiller.newsfeeds.com is now known as spamkiller.usenet.com. They changed it around a month ago.

Net::NNTP is part of libnet, which may well have been installed already. If you're getting an error saying it can't find the host, it probably means the module is OK. It wouldn't have got that far otherwise.

newsfeeds.com appears to have come back; it is working. It was a silly mistake: I think the server name went on to two lines in the Perl script and it didn't like it...
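
A stray line break in a configured host name is the sort of thing a small clean-up step before connecting would catch. The following is only a minimal sketch: clean_hostname is a hypothetical helper, not something in newnews.pl, and it simply strips all whitespace from the value.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of newnews.pl): remove any whitespace,
# including stray newlines, from a configured host name.
sub clean_hostname {
    my ($host) = @_;
    $host =~ s/\s+//g;
    return $host;
}

# A value that accidentally ran on to a second line in the script.
my $configured = "spamkiller.newsfeeds.com\n";
my $host       = clean_hostname($configured);
print "Connecting to '$host'\n";
```

The cleaned value can then be handed to whatever opens the NNTP connection.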

Anyway, currently going through the downloads, so I have it working... why did you have to release it now? It is 4:30 a.m. here and I can't stop (I am only kidding, just so pleased to play with this!)

One question: what is the "rejected bad header" for? I get quite a few of these, probably 20% of them so far. Is it spam, or are they rejected for something else?

The bad header is usually due to strange character sets or a bad news client being used to post the article. Usually you'll see this in test and binary groups where people may be testing out their own posting programs. 20% seems high. I was getting that sort of number when I was using alt.test. But a real newsgroup only rejects a tiny percentage.

The script actually puts out the rejected header message when it can't find a message id in the header. This is usually due to the reasons above.
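
As a rough illustration of that check, here is a sketch that looks for a Message-ID header in a list of header lines and reports a rejection when there is none. The helper name find_message_id and the sample headers are assumptions for illustration; the real parsing in newnews.pl may well differ.

```perl
use strict;
use warnings;

# Hypothetical helper: return the Message-ID value from a list of header
# lines, or nothing (undef in scalar context) if the header is missing.
sub find_message_id {
    my (@header_lines) = @_;
    for my $line (@header_lines) {
        if ($line =~ /^Message-ID:\s*(<[^>]+>)/i) {
            return $1;
        }
    }
    return;
}

# Made-up sample headers: one complete, one without a Message-ID.
my @good = ("From: someone\@example.com\n", "Message-ID: <abc123\@example.com>\n");
my @bad  = ("From: someone\@example.com\n", "Subject: test\n");

for my $header (\@good, \@bad) {
    my $id = find_message_id(@$header);
    print defined $id ? "Saved $id\n" : "Rejected bad header\n";
}
```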

ok few things:

1) get these errors once it has finished going through the articles:


Use of uninitialized value at newnews.pl line 150.
Use of uninitialized value at newnews.pl line 150.
Use of uninitialized value at newnews.pl line 150.
Use of uninitialized value at newnews.pl line 150.
Use of uninitialized value at newnews.pl line 174.


2) Posting hangs

It appears the script hangs when trying to send posts back to the newsgroups. tried twice and nothing has happened for over 10 minutes now.

3) Numbers..

is this odd:

Processing group alt.tv.stargate-sg1...2321 messages.

results:

usenet_ref : 6098
usenet_article : 1028
thread : 204
post : 774

seems a bit weird to me

i am going to try a few more newsgroups but any ideas on why posting ain't working?

The uninitialized value messages appear because the script runs under 'use strict' with the -w switch; it is the -w switch that prints them. They come from all the regular expression testing in the article parsing bit. It basically means it's trying to test an empty variable. They can be ignored, and if you take off the -w switch they will disappear.
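
An alternative to switching warnings off is to test whether a value is defined before matching against it. This is a small sketch of that idea only; looks_like_reply is a hypothetical function and is not taken from newnews.pl.

```perl
use strict;
use warnings;

# Hypothetical example: $subject may be undef when a header is missing.
sub looks_like_reply {
    my ($subject) = @_;
    # Check definedness first so the regex never sees an undefined value.
    return 0 unless defined $subject;
    return $subject =~ /^Re:/i ? 1 : 0;
}

print looks_like_reply("Re: Stargate DVDs"), "\n";
print looks_like_reply(undef), "\n";    # no warning is printed here
```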

The numbers you show seem ok. The 2321 are the numbers the server reports it has. This is usually slightly high by maybe 25 to 100 messages. The usenet_ref table is the many to many link between articles and replies. Because you are doing the initial history pull you will find a lot of messages left in the usenet_article table because it contains replies to threads that have since expired from the server. Once history is collected you can set the $expire variable to a sensible value and it will gradually reduce as time goes on.

As for the posting problem, it works for me in alt.test. I'll keep looking and let you know if I find any problems. What message are you seeing before it hangs? Do you see the "Posting message by $poster to $newsgroup..." message?

The posting problem is an SQL statement problem. I removed some unwanted fields from the usenet_outgoing table but didn't change the statement. Unfortunately I was using select * instead of listing the fields.

In the post_outgoing routine (the last function in the script) change line 281 that reads 'my $q2 = db_fetch(...'
to read:
my $q2 = db_fetch("SELECT poster,newsgroup,subject,refs,body FROM usenet_outgoing");
And change the line below it to:
while ( my ($poster,$newsgroup,$subject,$refs,$body) = $q2->fetchrow_array ) {
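
To see why SELECT * is fragile here, the sketch below simulates a row coming back from the table after a column has been removed. The leading "id" column and the sample values are assumptions for illustration only; the thread does not say which fields were dropped.

```perl
use strict;
use warnings;

# Old unpacking, written for a table that still had a leading column
# (hypothetically called "id" here).
sub unpack_old {
    my ($id, $poster, $newsgroup, $subject, $refs, $body) = @_;
    return ($poster, $newsgroup, $body);
}

# New unpacking, matching an explicit column list in the SELECT.
sub unpack_new {
    my ($poster, $newsgroup, $subject, $refs, $body) = @_;
    return ($poster, $newsgroup, $body);
}

# What the table returns once the extra column is gone.
my @row = ('chris', 'alt.test', 'Hello', '', 'Body text');

my ($p1, $n1, $b1) = unpack_old(@row);
my ($p2, $n2, $b2) = unpack_new(@row);
print "old: poster=$p1 newsgroup=$n1 body=", (defined $b1 ? $b1 : 'undef'), "\n";
print "new: poster=$p2 newsgroup=$n2 body=$b2\n";
```

With the old list every value shifts one place and the body ends up undefined, which is why naming the columns in the query keeps the two sides in step.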

Sorry about that. The fix is in the package for download.

There's always something I mess up!

ps. There's also another little change you can make to showthread.php that will ensure the posts are ordered correctly every time. Not essential but I've put it in the package.

[Edited by fastforward on 01-22-2001 at 12:39 AM]

thank you, i noticed when i posted a post to a thread it actually appeared 2nd to last as opposed to last so if it sorts that out, brilliant.

thanks, will apply fixes later today and see how it goes :)

ok couple of things:

I got the new download and got it working. It let me post OK, but the post I did sent about 5 messages, some with the headers in, which was a bit bizarre. I thought it might be corruption in the database from old failed attempts to post, so I started again, and posting worked.

Now:

1) I am getting this a lot


Use of uninitialized value at newnews.pl line 152.
(the same line is repeated 24 times)


after it finishes a run of article downloads. Any ideas?

2) Worried about numbers of articles/posts coming through.

Can you explain what usenet_ref and usenet_article are for so I can understand? These are my stats:

usenet_ref = 16520 entries
usenet_article = 3179 entries

threads = 724
posts = 1265

It just feels that I should have more threads and posts than I do.

Thanks,

I am going to continue playing.

p.s. I still get heavy, heavy failed downloads from alt.tv.stargate-sg1, which is strange; it is a perfectly valid newsgroup. alt.tv.farscape, however, has very, very few. I wonder why that is? Perhaps if you have a chance you could have a play with that newsgroup.

--

Anyway, it is a pretty amazing hack i really love it just hope these few things can be ironed out :)

cheers

Chris, the numbers you are seeing are fine. The variable messages are just warnings. See my post earlier in this thread about both these issues. If you really want to hide those warnings remove the -w after the perl line at the top of the script.

I'll look into the excessive bad header thing for you.

regards

Ah thanks, I understand now. I missed that post of yours, only saw your second one, and it answered my questions.

Love the hack, thanks for all the work.

sorry to be such a complete pain here.

Ok, i ran the script again (not on cron yet)

and it went through alt.tv.farscape:


Requesting article 58127 from alt.tv.farscape...Saved
Requesting article 58128 from alt.tv.farscape...Saved
Requesting article 58129 from alt.tv.farscape...Saved
Requesting article 58130 from alt.tv.farscape...Saved
Requesting article 58131 from alt.tv.farscape...Saved
Requesting article 58132 from alt.tv.farscape...Saved
Requesting article 58133 from alt.tv.farscape...Saved
Requesting article 58134 from alt.tv.farscape...Saved
Requesting article 58135 from alt.tv.farscape...Saved
Requesting article 58136 from alt.tv.farscape...Saved
Requesting article 58137 from alt.tv.farscape...Saved
Requesting article 58138 from alt.tv.farscape...Saved
Requesting article 58139 from alt.tv.farscape...Saved
Requesting article 58140 from alt.tv.farscape...Saved
Requesting article 58141 from alt.tv.farscape...Saved
Requesting article 58142 from alt.tv.farscape...Saved
Requesting article 58143 from alt.tv.farscape...Saved
Requesting article 58144 from alt.tv.farscape...Saved


but i go to the forum, and there are only two threads now marked new which seem to have anything else in them..

Weird, I thought. I went into a phpMyAdmin type thing, looked at the last records under usenet_article, searched for them (i.e. bits of their body text), and got no results.

So it does not appear to be a date/time thing (which would explain why they were not coming up at the top marked new); it just appears they are not actually getting into the database.

On this point, I understand that usenet_ref is a many-to-many thing so there will be loads, but what about usenet_article? Should this not be similar to the number of posts? If so, that is my problem, as my usenet_article is 3 times bigger than my number of posts.

Sorry about all these questions, bet you regret releasing this now :)

When articles are pulled down from usenet, their order within the thread is calculated and the article is placed in usenet_article. This table is just a temp holding area. At the same time the usenet_ref table is populated with the message ids of the articles it refers to.

The next step is to load any messages from the usenet_article table that have no refs (i.e. initial thread starters) into the thread and post tables. Then it deletes those articles from the usenet_article and usenet_ref tables.

Next it goes through the usenet_article table in conjunction with the usenet_ref table and loads any messages that refer to an article already in the post table. It will only place a message into the post table if it is in order. For example, if posts 1 and 2 are already in the post table and 4, 5 and 6 are in the usenet_article table, the articles will NOT get loaded because article 3 is still not available. If you look at the 'ord' column, only posts at the same level or one higher will get loaded. Then it deletes the article from usenet_article and usenet_ref.
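
The loading rule described above can be sketched in memory, without a database. This is only an illustration of the rule as stated: load_pass and the data layout (a hash of posted message ids to their 'ord', and a list of pending articles) are assumptions, not the script's actual code, which does the work in SQL.

```perl
use strict;
use warnings;

# One pass: move every pending article that refers to an already loaded
# post at the same level or one level up. Returns the number moved.
sub load_pass {
    my ($posted, $pending) = @_;   # hashref msgid => ord, arrayref of articles
    my %snapshot = %$posted;       # decide against the state at the start of the pass
    my @still_pending;
    my $moved = 0;
    for my $art (@$pending) {
        my $ok = grep {
            exists $snapshot{$_}
              && ($snapshot{$_} + 1 == $art->{ord} || $snapshot{$_} == $art->{ord})
        } @{ $art->{refs} };
        if ($ok) {
            $posted->{ $art->{msgid} } = $art->{ord};
            $moved++;
        }
        else {
            push @still_pending, $art;
        }
    }
    @$pending = @still_pending;
    return $moved;
}

# Posts 1 and 2 are loaded; 4 and 5 are waiting for the missing article 3.
my %posted  = ('<1>' => 0, '<2>' => 1);
my @pending = (
    { msgid => '<4>', ord => 3, refs => ['<1>', '<2>', '<3>'] },
    { msgid => '<5>', ord => 3, refs => ['<1>', '<2>', '<3>'] },
);
print "moved: ", load_pass(\%posted, \@pending), "\n";    # nothing can load yet

# Article 3 arrives; it loads on the next pass, 4 and 5 on the pass after.
push @pending, { msgid => '<3>', ord => 2, refs => ['<1>', '<2>'] };
print "moved: ", load_pass(\%posted, \@pending), "\n";
print "moved: ", load_pass(\%posted, \@pending), "\n";
```

Because each pass only looks at what was already loaded when it started, a deep conversation needs several passes before every level has been moved across.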

As an example for numbers you might expect; I mirror 6 newsgroups and have 2069 in thread, 4463 in post, 1288 messages still in usenet_article and 4463 in usenet_ref. My $expire is set to 14 days. This means any messages left in usenet_article that are over 14 days old will be deleted. Remember, the reason they are still in usenet_article is because the article they refer to is not available on the server or was rejected for spam or something.

The messages not showing up as new happens due to the nature of usenet and the way messages are propagated. The modification made to showthread.php uses the 'ord' column to sort by first, then the dateline. This makes the messages appear in the correct order. However, vBulletin uses the date to determine whether the message is new. This means not all messages will show up as new when they should. You will also run into problems with users not setting their clocks correctly. Although the script handles time zones correctly, I often see articles with a future date because the user's PC clock is wrong.

The first version I made of this script created a dummy date based on the order of the article. So each time a post was received the time was set to a few minutes after the article it referred to. This avoided the problem but meant you had no way of knowing when the post was really made. Maybe I should go back to this method. What do you think?
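
The two-level ordering (by 'ord' first, then by dateline) looks like this when written as a Perl sort. It is a sketch of the ordering only; the real change lives in showthread.php, and sort_posts and the sample posts here are made up for illustration.

```perl
use strict;
use warnings;

# Primary key: ord (depth in the conversation); tie-breaker: dateline.
sub sort_posts {
    my (@posts) = @_;
    return sort {
        $a->{ord} <=> $b->{ord} or $a->{dateline} <=> $b->{dateline}
    } @posts;
}

# Sample posts, deliberately out of order.
my @posts = (
    { ord => 1, dateline => 980265300, text => 'later reply' },
    { ord => 0, dateline => 980265400, text => 'thread starter with a fast clock' },
    { ord => 1, dateline => 980265100, text => 'earlier reply' },
);

print "$_->{ord} $_->{dateline} $_->{text}\n" for sort_posts(@posts);
```

Note that the thread starter stays first even though its date is later than both replies, which is the point of sorting on 'ord' before the date.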

Addendum:
Since posting this, I just took another look at showthread.php. I believe it will be quite easy to fix the new post icon issue. I will have to create an additional dummy date field based on the actual date the article is placed into your forums. The real date will remain where it is now. I will modify showthread.php to use the dummy date field when doing anything with usenet forums but use the normal field for other forums. This will also eliminate the need for the original modification in that file. I'll try to make the fix this evening.

regards

[Edited by fastforward on 01-23-2001 at 10:48 AM]

Thanks, that really helps sort it out in my mind.

Couple of questions on this:

Say you have 1,2,3,4 of a thread which are in the thread and then 5 is rejected for spam, will 6, 7, 8 etc still go into the thread or will they never make it because 5 is missing?

The reason I ask is that I know there are some threads in the newsgroup that only have a few posts in them on my forum, and they haven't been updated in the forum, and I am wondering if this is perhaps why.



Another problem:

example

http://x67.deja.com/viewthread.xp?AN=718640869&search=thread&svcclass=dncurrent&ST=PS&CONTEXT=980265070.1705508975&HIT_CONTEXT=980265070.1705508975&HIT_NUM=21&recnum=%3c3tim6tkrqg3r4vm9jhjjmk5jjhq2l0ph5d@4ax.com%3e%231/1&group=alt.tv.stargate-sg1&frpage=viewthread.xp&back=clarinet

i hope that url works from deja.
It is the thread stargate DVDs in alt.tv.stargate-sg1
and is a full thread.

Ok, in my forum i have the thread:
http://www.ascifi.net/forums/showthread.php?threadid=656

but only the first two posts. And not really the first two either:

the first post is fine
the second post is only the quote of the first post. All the original content is missing
None of the other 10 or so posts are there.

Now in the newsgroup, on the server spamkiller.newsfeeds.com, it looks exactly the same as on deja. To my mind there definitely seems to be something not working here because:

- it is a new thread, with no posts missing
- the newsgroup has all the posts and threads in an identical fashion to deja.com's listing
- the thread is on the forum, along with one other post, but that post appears corrupted.


--

On this point, I realise that the mod is not supported or anything and you have already spent loads of time trying to help me. I can get you access to my SQL database for this forum if you want, but I completely understand if you don't have time; not a problem...

---

Date sorting.

i think your old method of adding a few mins would be better, in my opinion the ordering of a thread is more important than knowing when the post was posted (especially when it might be wrong anyway). It would perhaps be possible to have another field for "date2" or something and somehow add that in somewhere at the bottom like

"This newsgroup reported this post was posted at blah" not that important really, i think ordering is more important.

and on the whole 1,2 not 3 then not 4 and 5.

perhaps it would be better to have 4 and 5 anyway listed in the thread if it is known 4 and 5 refer to 1 and 2 (i assume so???). then if and when 3 comes along it can be added in to the thread using the date system.

It might be possible to actually create a post saying:

"sorry, this post is unavailable" in place of post 3. If post 3 then comes along then add it in.

A situation may happen when someone posts post 3 and then deletes it (i think you can do this can't you) or if it gets deleted by someone else for spam or any other reason and it is a shame for one mucked up post to prevent the whole thread being downloaded.

This assumes that post 4 and 5 refer to post 1, 2 and 3 but i think they do as this is the many to many thing you were talking about for usenet_ref i think so it should work.

I am learning this whole usenet system as we go along so sorry if making mistakes.

The only reason a message will not be put into the forum is if the thread starter does not exist.

You are right, there may be occasions when later posts refer to all levels of the conversation. In fact all messages will refer to the thread starter by default. This fact ensures messages are only discarded if the thread starter is not available.

Usenet is threaded. Apart from starting a new conversation, the only way to post to usenet is by responding to a particular article. You should think of a usenet 'thread' as a conversation that consists of many threads. Each article contains the message ids of the earlier articles in its own thread. Each article within the conversation has an 'order' based on the level within the conversation and the order within a particular thread. All messages within any thread will contain the message id of the conversation starter as a minimum.

So a conversation can consist of:

1- Conversation starter = ord 0
2- Reply to 1 = ord 1
3- Reply to 2 = ord 2
4- Reply to 3 = ord 3
5- Reply to 3 = ord 3
6- Reply to 1 = ord 1
7- Reply to 6 = ord 2

Looking at the above, you will see that messages 1 thru 4 all belong in the same thread. They will each contain the message id of all articles before them. Message 5, however, will not contain the message id of 4 as it does not refer to it. Likewise, message 6 will only contain the message id of number 1 in the header; it will know nothing about the other articles in the conversation. But what is important is the fact that all messages will have at least the message id of the conversation starter in their header. So, as long as the conversation starter exists, the messages WILL be put into your forum.

Originally posted by fastforward
, as long as the conversation starter exists, the messages WILL be put into your forum.

but, unfortunately, they aren't.

have a look at my example from above, i think you may have missed my first post (like i did to you :) ) for an example of where this does not seem to be happening.

Yep I missed your post. I'd swear blind it wasn't there a minute ago! :o

As for the missing body in Eric's message; that IS strange. Obviously something is going wrong with one of my parsing statements. Probably a dodgy regular expression. It seems the first line from all posts is being chopped off. Should be an easy fix. I'm sure that doesn't happen on mine though.

Can you confirm that the missing messages are in the usenet_article table? If they are then you can ignore my earlier post as it was obviously bollox! :( If they're not in there then they are being rejected for some other reason.

Either way I'll look into it.

regards

[Edited by fastforward on 01-23-2001 at 12:33 PM]

yup they are in there so... as you say bollox :) lol

at least i know i am not mad now :) i wonder what it is the problem.. strange really.

OK.. if they are there then the fix should be easy. The way it finds which articles to put in the forum at the moment is by using the following query:

SELECT a.forum, a.msgid, a.dtm, a.subject, a.poster, a.body, a.ord, b.threadid, c.ref FROM usenet_article AS a, thread AS b, usenet_ref AS c, post AS d WHERE a.msgid=c.msgid AND c.ref=b.msgid AND b.forumid=$group->{forumid} AND d.msgid=b.msgid AND ((d.ord + 1 = a.ord) OR (d.ord=a.ord))

This must be flawed in some way. I was going to do it by looping through the records but I thought I'd be clever by doing it in one go with this query. I'll look into it and find another way of picking the articles. In the meantime, leave the missing articles where they are. When I give you the fix it should just pick them up and move them to the forum.

One last thing: can you look at the 'ord' and 'refs' columns in the usenet_article table for the missing posts? Make sure that something is listed in the 'refs' column and let me know what the 'ord' number is. Then do the same for the 2 messages that are in the post table and check the 'ord' column.

ok for message three from

http://x67.deja.com/viewthread.xp?AN=718640869&search=thread&svcclass=dncurrent&ST=PS&CONTEXT=980265070.1705508975&HIT_CONTEXT=980265070.1705508975&HIT_NUM=21&recnum=%3c3tim6tkrqg3r4vm9jhjjmk5jjhq2l0ph5d@4ax.com%3e%231/1&group=alt.tv.stargate-sg1&frpage=viewthread.xp&back=clarinet

now refs are:


<3tim6tkrqg3r4vm9jhjjmk5jjhq2l0ph5d@4ax.com> <nTUa6.16025$T5.1752524@typhoon.midsouth.rr.com>


so there is something there

and the order is:

2.

this is perhaps weird as article 3 is

Article 1
---> Article 2
--------> Article 3

should it not be order "3" i wonder?


I have sent you access to the SQL database at webmaster @ yourdomain.com as you have turned email off here. I can send it to another email if you would like.

2 is correct. The first message is number 0. This indicates the problem definitely lies with the query I posted earlier. All the ord column really is is a count of how many references are in the refs column. I'll start fixing it now.
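
Since 'ord' is just a count of the message ids in the refs column, it can be worked out with one regular expression. The sketch below uses the refs value quoted earlier in the thread; ord_from_refs is a hypothetical helper name, not a function from newnews.pl.

```perl
use strict;
use warnings;

# Hypothetical helper: count the <message-id> tokens in a References value.
sub ord_from_refs {
    my ($refs) = @_;
    return 0 unless defined $refs && length $refs;
    my @ids = $refs =~ /(<[^>]+>)/g;
    return scalar @ids;
}

# Single quotes keep the @ and $ characters in the message ids literal.
my $refs = '<3tim6tkrqg3r4vm9jhjjmk5jjhq2l0ph5d@4ax.com> '
         . '<nTUa6.16025$T5.1752524@typhoon.midsouth.rr.com>';

print "ord = ", ord_from_refs($refs), "\n";    # two references, so ord is 2
print "ord = ", ord_from_refs(''), "\n";       # a thread starter has none
```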

Replace line 162 of newnews.pl that reads:

$qry = db_fetch("SELECT a.forum, a.msgid, a.dtm, a.subject, a.poster, a.body, a.ord, b.threadid, c.ref FROM usenet_article AS a, thread AS b, usenet_ref AS c, post AS d WHERE a.msgid=c.msgid AND c.ref=b.msgid AND b.forumid=$group->{forumid} AND d.msgid=b.msgid AND ((d.ord + 1 = a.ord) OR (d.ord=a.ord) OR a.ord = 1)");


with this:

$qry = db_fetch("SELECT a.forum, a.msgid, a.dtm, a.subject, a.poster, a.body, a.ord, b.threadid, c.ref FROM usenet_article AS a, thread AS b, usenet_ref AS c, post AS d where b.threadid = d.threadid and b.forumid = $group->{forumid} and c.ref = d.msgid and a.msgid = c.msgid AND ((d.ord + 1 = a.ord) OR (d.ord=a.ord))");
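
One reading of the difference between the two WHERE clauses is sketched below in plain Perl. The old join compares the article only against the thread starter's post, so anything deeper than 'ord' 1 can never match; the new join compares it against any post in the thread that it references. The function names and the sample data are assumptions for illustration, not code from the script.

```perl
use strict;
use warnings;

# The ord test shared by both queries: same level or one level up.
sub ord_matches {
    my ($post_ord, $art_ord) = @_;
    return $post_ord + 1 == $art_ord || $post_ord == $art_ord;
}

# Old join: c.ref = b.msgid AND d.msgid = b.msgid, i.e. starter only.
sub old_rule {
    my ($art, $posts, $starter) = @_;
    my $refers_to_starter = grep { $_ eq $starter } @{ $art->{refs} };
    return 0 unless $refers_to_starter && exists $posts->{$starter};
    return ord_matches($posts->{$starter}, $art->{ord}) ? 1 : 0;
}

# New join: c.ref = d.msgid, i.e. any referenced post in the thread.
sub new_rule {
    my ($art, $posts) = @_;
    my $hits = grep {
        exists $posts->{$_} && ord_matches($posts->{$_}, $art->{ord})
    } @{ $art->{refs} };
    return $hits ? 1 : 0;
}

# Two posts are already in the forum; the third message (ord 2) is pending.
my %posts   = ('<starter>' => 0, '<reply1>' => 1);
my $article = { msgid => '<reply2>', ord => 2, refs => ['<starter>', '<reply1>'] };

print "old rule loads it: ", old_rule($article, \%posts, '<starter>'), "\n";
print "new rule loads it: ", new_rule($article, \%posts), "\n";
```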


I'm still not sure why you are getting the first line chopped off some of the messages. I don't seem to have the problem. I'll keep looking.

The above fix is now in the package for download.

You'll need to run the newnews.pl half a dozen times or so to catch up and load all the missing posts.

[Edited by fastforward on 01-23-2001 at 03:36 PM]

wehhayyy massive improvement.

old stats:

usenet_article = 3179 entries
posts = 1265

new stats:

usenet_article = 2700 entries
posts = 5948

What is strange is that I have so many more posts but not that many fewer usenet articles. Strange still, methinks; shouldn't it be a 1 for 1 swap?


There are also some still not in there, i will have to explore why and it is perhaps the missing first article thing we talked about.

I will also explore a bit more that missing first line thing and see if it happens again.

Thanks for fixing this, going to get a few of my users test using it now to see what they think.

Hmmm... That does seem very odd. You definitely should have a one to one increase/decrease in those tables. Unless of course you just downloaded a whole bunch of other articles that don't have anything to refer to. But that seems unlikely. What is the setting for your $expire variable? Now that you have pulled all the history, you should have it set to 7-14 days. Remember the number needs to be in seconds though.

Do you have to have "allow guest to post" set to "on" to be able to use this hack?

OK, my new plan is to start again. I will delete posts, threads etc. and see how it goes.
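
For reference, the days-to-seconds arithmetic for $expire is shown below. The helper names and the age check are assumptions for illustration; only the $expire variable itself and the 7-14 day range come from the thread.

```perl
use strict;
use warnings;

# Convert a number of days into seconds.
sub days_to_seconds {
    my ($days) = @_;
    return $days * 24 * 60 * 60;
}

# Hypothetical age check: true when an article is older than $expire seconds.
sub is_expired {
    my ($article_time, $now, $expire) = @_;
    return ($now - $article_time) > $expire ? 1 : 0;
}

my $expire = days_to_seconds(14);    # 14 days = 1209600 seconds
print "\$expire = $expire\n";

my $now = time;
print "20-day-old article expired: ", is_expired($now - days_to_seconds(20), $now, $expire), "\n";
print "2-day-old article expired:  ", is_expired($now - days_to_seconds(2),  $now, $expire), "\n";
```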

dunefreak,

No you don't need 'Guest posting enabled'. The script simply forces the usenet posts into the tables in a way that makes vBulletin think it was a guest that created the post. All normal vBulletin permissions and options work on the usenet forums.

ok slight problem:

Someone posted on the newsgroup, i think 2 messages (maybe 3 as their post count is 3 but only two messages are found when searching by username.. bizarre)

anyway..

in one thread these were the results (3 posted messages)


Thanks, Mandy. That was great. Loved the Teal'c and his staff weapon and Major Davis and his chocolate bath... but bring back MARTOUF!!



One question though...why wasn't General Hammond getting devested? Is he too old or...

Sent via AsciFi.com - the number 1 science fiction community
Visit us today --> http://www.ASciFi.com/?source=news


this is the correct post in the correct thread. this is when we have problems. Next posts:


Thanks, Mandy. That was great. Loved the Teal'c and his staff weapon and Major Davis and his chocolate bath... but bring back MARTOUF!!



One question though...why wasn't General Hammond getting devested? Is he too old or...

Sent via AsciFi.com - the number 1 science fiction community
Visit us today --> http://www.ASciFi.com/?source=news
Subject: The Dannytoes Dream Episode, Pt 2
From: jsc
Sender: jsc
Newsgroups: alt.tv.stargate-sg1
Content-Type: text/plain; charset=ISO-8859-1:
Content-Transfer-Encoding: 8bit
User-Agent: AsciFi.com Forums
References: 2jsc980271399 <20010122100836.02009.00001018@nso-ma.aol.com> <94hu5k$q8p$2@news6.svr.pol.co.uk> <94kov7$bqi$1@neptunium.btinternet.com>
Organization: AsciFi.com (http://www.ascifi.com.com)

This is a great story. Love the Major Davis and Dr. Frasier interlude! Telll the T-man to get himself to my home and I'ss show him some acorns and strong oaks!

Sent via AsciFi.com - the number 1 science fiction community
Visit us today --> http://www.ASciFi.com/?source=news
Subject: Dream Episode, Part 3
From: jsc
Sender: jsc
Newsgroups: alt.tv.stargate-sg1
Content-Type: text/plain; charset=ISO-8859-1:
Content-Transfer-Encoding: 8bit
User-Agent: AsciFi.com Forums
References: 2jsc980271609 <20010123153937.04061.00000750@nso-mh.aol.com>
Organization: AsciFi.com (http://www.ascifi.com.com)

Mandi this is fun to read! Please keep it up!



And tell T-Man to do the same!

Sent via AsciFi.com - the number 1 science fiction community
Visit us today --> http://www.ASciFi.com/?source=news


and next one


Thanks, Mandy. That was great. Loved the Teal'c and his staff weapon and Major Davis and his chocolate bath... but bring back MARTOUF!!



One question though...why wasn't General Hammond getting devested? Is he too old or...

Sent via AsciFi.com - the number 1 science fiction community
Visit us today --> http://www.ASciFi.com/?source=news
Subject: The Dannytoes Dream Episode, Pt 2
From: jsc
Sender: jsc
Newsgroups: alt.tv.stargate-sg1
Content-Type: text/plain; charset=ISO-8859-1:
Content-Transfer-Encoding: 8bit
User-Agent: AsciFi.com Forums
References: 2jsc980271399 <20010122100836.02009.00001018@nso-ma.aol.com> <94hu5k$q8p$2@news6.svr.pol.co.uk> <94kov7$bqi$1@neptunium.btinternet.com>
Organization: AsciFi.com (http://www.ascifi.com.com)

This is a great story. Love the Major Davis and Dr. Frasier interlude! Telll the T-man to get himself to my home and I'ss show him some acorns and strong oaks!

Sent via AsciFi.com - the number 1 science fiction community
Visit us today --> http://www.ASciFi.com/?source=news


i have no idea what is going on here, it appears that messages are being mixed together, and then all posted to the same thread (the thread where the second post was meant to go never went there... strange)...


One final thing this brought up: if someone deletes their own post, it should be deleted from the awaiting-to-send-to-usenet table if it has not yet gone.

Any ideas what is going on here? It only occurs when more than 1 message goes from the awaiting-to-send table at once.

reading through again this is what appears to have happened.

first post --> sent to correct thread correctly.

2nd post combined with 1st post (including headers) --> sent to forum of first thread.

3rd post combined with 1st post and 2nd post (including headers) --> sent to forum of first thread.

not good :)

Ooops! It seems the array that holds the messages for sending is not cleared before it starts reading the next one. I must admit, I never tested it with more than one message in the outgoing table. Here's the fix:

On line 279 of newnews.pl, just after the line that reads

$c->post(@article);


Add the following line:

@aticle = ();


This is an extra line, NOT a replacement.
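
The effect of not emptying the array between messages can be reproduced without a news server. In this sketch build_articles is a hypothetical stand-in for the posting loop: it collects what would have been sent, with and without clearing the array after each message.

```perl
use strict;
use warnings;

# Build one "article" per body. When $clear_between is false the array
# keeps growing, so each message also contains all the earlier ones.
sub build_articles {
    my ($clear_between, @bodies) = @_;
    my @article;
    my @sent;
    for my $body (@bodies) {
        push @article, "Subject: test\n", "\n", "$body\n";
        push @sent, join('', @article);    # stands in for the NNTP post call
        @article = () if $clear_between;
    }
    return @sent;
}

my @buggy = build_articles(0, 'first', 'second');
my @fixed = build_articles(1, 'first', 'second');

print "--- second message without clearing ---\n$buggy[1]";
print "--- second message with clearing ---\n$fixed[1]";
```

Running under 'use strict' also helps with a change like this, because a misspelled array name stops the script at compile time instead of silently creating a new, empty variable.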

Sorry about that.

thanks a lot :)

now assuming the next run works and i don't find any more problems in the actual collection of the posts (let's hope) a few things/improvements for v2.0 (have a rest first after all the help though!)

1. Delete post from table to be sent if user deletes it. This may be because they post in wrong forum etc.

2. Using BPCode i would think to change the colour of quoted stuff
>stuff
>here
>for example
see. Not quite sure how this can be done really. Perhaps adding the bpcode on import would be best???

3. I am going to separate posts in newsgroups from posts on forums in the user account. I doubt this will be hard, and it will be part of vBulletin as opposed to this hack.

4. using more than one server, i know of some newsgroups i would like to mirror that are only on 1 specific server, would need a choice for these.

5. auto pruning of a guest's user posts and auto deletion of any of these posts upon upload (referenced by a username in a certain newsgroup or by an email address etc.)

6. Hard one this is, email notification of a new newsgroup post to a thread that a member of the forum has posted in.

7. Can variables be passed into the signature i would like a sig like


Posted from ASciFi.com - the number 1 science fiction community.
View this thread on the internet at http://www.blah.com/blah?blah=blah

type thing. Would be quite cool.

Anyway, that is just a few possible things I can think of; more I am sure will come :) Perhaps add a few of these in (email notification most useful probably) when you do the change for v2.0 (hope that is not too hard, shouldn't be I would not think).

Cheers again for all your help.

chris

What an original and most excellent hack, I look forward to experimenting with it when V2.0 comes out.

Thanks

i now get the following error message referring to the new line


Global symbol "@aticle" requires explicit package name at newnews.pl line 296.
Execution of newnews.pl aborted due to compilation errors.

ok to get it to work just comment out the use strict.

It now works but i get this error message:


Name "main::aticle" used only once: possible typo at newnews.pl line 296.


i doubt this matters though.

Chris,

It looks like you may have forgotten to use the '@' symbol in front of 'article' on that new line you added.

Every mention of 'article' in the sub-routine post_outgoing() should be prefixed with @

What exactly do you have on line 296?

The post_outgoing() routine should be this:


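sub post_outgoing {
    # $c (the news server connection), $useragent, $organization and
    # $footer are assumed to be set up earlier in newnews.pl.
    print "Processing outgoing messages\n";
    my @article = ();
    # Name the columns so they always line up with the variable list below.
    my $q2 = db_fetch("SELECT poster,newsgroup,subject,refs,body FROM usenet_outgoing");
    while ( my ($poster,$newsgroup,$subject,$refs,$body) = $q2->fetchrow_array ) {
        # Build the headers for this message.
        push(@article,"Subject: $subject\n");
        push(@article,"From: $poster\n");
        push(@article,"Sender: $poster\n");
        push(@article,"Newsgroups: $newsgroup\n");
        push(@article,"Content-Type: text/plain; charset=ISO-8859-1\n");
        push(@article,"Content-Transfer-Encoding: 8bit\n");
        push(@article,"User-Agent: $useragent\n");
        if ($refs) { push(@article,"References: $refs\n"); }
        push(@article,"Organization: $organization\n");
        # Wrap the body at 72 columns; autoformat returns the wrapped text.
        $body = autoformat($body, {right=>72});
        # A blank line separates the headers from the body.
        push(@article,"\n$body\n\n$footer\n");
        print "Posting message by $poster to $newsgroup...\n";
        $c->post(@article);
        # Empty the array so the next message does not include this one.
        @article = ();
    }
    # Everything has been posted, so clear the outgoing queue.
    db_execute("DELETE FROM usenet_outgoing");
}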

even more silly, you made a typo and i copied and pasted it:


@aticle = ();


oops it should be article :)

i should have noticed that :)

I just received an email from HostPro telling me that I was spamming newsgroups. I know this was not from my account. I have accounted for all posts. All I can imagine is somebody downloaded the script and didn't change the $footer variable (which had my site in it). Please... if you have the script and didn't change it, change it NOW.

fastforward, i would recommend a search at deja to try and find out which newsgroups it was in, we may be able to narrow it down for you.

ok it only came up with 4:

http://www.deja.com/dnquery.xp?QRY=britishexpats.com&ST=MS&svcclass=dncurrent&DBS=2

they must have deleted the rest?

Thanks Chris

Those are all good posts. Well the non-test ones are anyway. There was also a bunch more I posted to 01alt.test, but they should be ok. I mean that's what test groups are for right? Surely they couldn't be construed as spam.

i seriously doubt it, what did HostPro actually tell you? did they say which newsgroup or what? if they didn't then ask them, it could very well be some competing website that just sent them an email saying you were. I bet they didn't even check it themselves.

Unfortunately because we all hate spam so much some people are too quick to blame people who did nothing wrong. It doesn't appear anyone else would have done it, i think most are waiting for vb 2.0 before they play with this, so i would think it is just someone being a pain.

p.s. you are using http://www.newsfeeds.com - not happy about them, the article numbers on their front page are for their goliath server, a huge huge difference to their other spamkiller server which has massively fewer posts in it. Probably why we end up with so many in the temp table when really there should not be that many (how many news threads go on for more than a month say, and they should have a month's worth of posts). Have to cancel my account to re-sign up on the goliath server. akk...

Yeah, I did ask hostpro. But the girl who sent the email has left for the weekend. The only thing in their notes was something about spamming newsgroups. No details. I just realized though; the organization field also contains my site address. There's no way to search for that though. I guess I'll just have to wait until Monday.

As for the spamkiller server problem. I did try goliath, but it was way too slow. I even tried the one reserved for platinum account holders. Spamkiller was by far the fastest.

Anyway, thanks for your help.

By the way, I've made a lot of changes to the script. It now italicizes quotes from usenet messages (we'll have to wait for v2 for colours). I also fixed a threading problem when a local reply is made to a locally started usenet thread. The problem was that the message id does not exist until the message reaches usenet. I added some logic to update the local message with the message id on the next pull. Unfortunately I had to add some more fields to the tables and also renamed a few fields. You'd have to reinstall and re-pull history if you wanted it. Or you can wait for the version 2 release. The code is a lot tidier now. Let me know if you want a copy.

regards

sounds very cool, i will wait until v2.0 to try again, i was really just trying it out and playing around with it first and it is going to be really great feature for my site i know!

on the whole news server thing, they have a new one called goliath2 which they say is a lot faster, it also has more files in it as well, did you ever try that one? i am quite keen on getting as many threads to start with as possible.

otherwise i wonder what other companies are good. One thing i like about this one is that they don't actually have your credit card details but use that external company. This means if they do try and do their trumped up $50/hour spam removal charges if one of our users does something silly or whatever then they can't actually charge me, something i am a bit worried about.

anyway, looking forward to the v2.0 release of this :)

I really want to try this on my board, but I need to ask a couple of things first.

Where exactly would the newsfeed come from? Would I have to pay for this or can I get it for free?

When installing the CPAN modules, could I damage my server at all?

And lastly, the d/l link won't work, can someone email it to me? Cheers :) <G>

newsfeeds.com you pay for. You can try to find free ones but good luck in your search :) (search for free usenet at dmoz.com for sites on it).

downloading cpan modules, i doubt it but don't know. I have never had any problem before, however with these i think it tried to install perl again somewhere, i cancelled it and nothing appeared to go wrong although i probably have a few pointless files somewhere. That probably does not inspire you with confidence but i have installed quite a few modules this way and never ever had a problem, and this was probably me just not understanding what was going on so i wouldn't worry. Otherwise just use the modules in the file.

Can't send it to you, it has probably been taken down to sort the headers and things out.

I've never had problems with CPAN. The attempted re-install of Perl that chris is talking about was probably while installing libnet. If it finds it is already installed it will ask a whole bunch of questions regarding smtp servers and host names etc. Most of the modules will most likely be already installed if it is a managed server. The modules already on HostPro's vservers were

DBI
DBD::mysql
Date::Parse (part of the TimeDate package)
Net::NNTP (part of the libnet package)
Net::SMTP (part of the libnet package)

So I only had to install one module:
Text::Autoformat


By the way, the download link is broken while I add email notification for replies and italicized quotes. There are a few other fixes too. I'm modularizing it so it will be easy to just drop in the section for vb2 without redownloading and reloading all the posts.

regards

fastforward,

any plans for the email notification idea:

basically, someone in usenet starts a thread, member a comes along to the forums and posts, a usenet member then posts a reply which gets uploaded to the forum. I then want it to notify member a.

how?

well, i guess it could work like.

while it moves posts from the temp to the actual post section, it does a search for all the members who posted in that thread who have email notification on and sends them the email.

The only problem i can see with that approach is if you get two posts from usenet to the same thread, how to stop it sending two emails.

perhaps you have a better idea?

I'm adding that feature right now. It will work exactly as you suggest. It will only send an email once for every usenet batch pull. If you get 3 messages in the thread during a batch, only one mail will be sent to each user. Then if the next batch contains a message, it will send another email.
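
A rough sketch of that once-per-batch rule in Perl, for anyone following along. The sub name and the data layout are made up for illustration and are not taken from newnews.pl; it only shows how a %seen hash keeps each member down to one mail per thread per pull.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of newnews.pl.
# $batch       - array ref of thread ids, one entry per usenet reply in this pull
# $subscribers - hash ref: thread id => array ref of subscribed email addresses
# Returns a list of [thread id, email] pairs, each pair at most once.
sub notifications_for_batch {
    my ($batch, $subscribers) = @_;
    my (%seen, @to_send);
    for my $threadid (@$batch) {
        for my $email (@{ $subscribers->{$threadid} || [] }) {
            # Skip members already queued for this thread in this batch
            next if $seen{"$threadid\0$email"}++;
            push @to_send, [ $threadid, $email ];
        }
    }
    return @to_send;
}

# Three replies arrive for thread 42 and one for thread 7 in the same pull
my @mails = notifications_for_batch(
    [ 42, 42, 42, 7 ],
    { 42 => [ 'a@example.com', 'b@example.com' ], 7 => [ 'a@example.com' ] },
);
print "thread $_->[0] -> $_->[1]\n" for @mails;
```

Because %seen is created fresh on every call, the next pull starts with a clean slate and can mail the same member again, which matches the behaviour described above.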

I'm trying to figure out how to use the existing vb template for email from within the perl script at the moment.

I'm also adding a feature to allow vb style quotes for usenet messages. It will mean you can choose italics or the <QUOTE> method. The choice will really depend on the newsgroup and whether the majority of posters use sensible news programs that apply quotations correctly.

regards

i would imagine this is something that is going to change a bit with vb2 with the whole vbhome type thing so hope not too hard to change over, glad you are implementing this though :) cool.

Originally posted by fastforward

I'm also adding a feature to allow vb style quotes for usenet messages. It will mean you can choose italics or the <QUOTE> method. The choice will really depend on the newsgroup and whether the majority of posters use sensible news programs that apply quotations correctly.

regards

i don't think the whole (quote) and (/quote) thing will work too well, the problem is often that the lines of text get cut off so you get something like

> sdgfasdfgksdgfjdsgkdskgfjsdgjdkg nmfg dsfn mcnv bmcxvn bmxcvnb
df
> sdgfdsfgsdkjgfdksgf kdjfg kdfg dfn gmdfng dng mdfng mdfgn
dsdf
> skf skdg dmfg dkg kdfg dfgdf,gnsdfmgdsfgmsdgf
>

which really does not look good.

I just wonder if there is any clever logic that could be used, beyond just looking for the > at the start of the line, to take these situations into account? i will have a think about it.
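
One possible heuristic, sketched in Perl: treat a short unquoted line that sits directly between two quoted lines as the tail of the quoted line above it. This is only a guess at what might work, not something from the script; the sub name and the length threshold are assumptions.

```perl
use strict;
use warnings;

# Hypothetical helper: glue short fragments back onto the quoted line they
# were wrapped off. $max_fragment is the longest line treated as a fragment.
sub rejoin_wrapped_quotes {
    my ($max_fragment, @lines) = @_;
    my @out;
    for my $i (0 .. $#lines) {
        my $line        = $lines[$i];
        my $prev_quoted = @out && $out[-1] =~ /^>/;
        my $next_quoted = $i < $#lines && $lines[ $i + 1 ] =~ /^>/;
        if (   $line !~ /^>/
            && length($line) > 0
            && length($line) <= $max_fragment
            && $prev_quoted
            && $next_quoted )
        {
            $out[-1] .= " $line";    # append the fragment to the quote above
            next;
        }
        push @out, $line;
    }
    return @out;
}

my @fixed = rejoin_wrapped_quotes(
    10,
    '> first quoted line that was too', 'long',
    '> second quoted line',
    '',
    'my actual reply',
);
print "$_\n" for @fixed;
```

The obvious weakness is a genuine one-word reply typed straight between two quoted lines with no blank line around it; that would be merged into the quote by mistake.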

--

it is also the one time that i wish that vb allowed threaded listings like wwwthreads :)

oh the other thing was being able to use more than one usenet server.

The way i could see this being done is having a table containing usenet information,

ID
server address
username
password

etc.

and then for each newsgroup you have an extra column for server id. something like that.
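
To make the idea concrete, here is a small Perl sketch that groups newsgroups by a server id so the script would only need to connect once per server. The host names, credentials and field names are invented for illustration; in the proposal above they would be rows in two tables.

```perl
use strict;
use warnings;

# Illustrative data only (hypothetical servers and credentials)
my %servers = (
    1 => { address => 'news.example.com',  username => 'user1', password => 'secret1' },
    2 => { address => 'news2.example.net', username => 'user2', password => 'secret2' },
);
my @newsgroups = (
    { name => 'alt.test',            server_id => 1 },
    { name => 'alt.fishing.catfish', server_id => 1 },
    { name => 'example.rare.group',  server_id => 2 },
);

# Returns a hash ref: server id => array ref of newsgroup names
sub groups_by_server {
    my ($groups) = @_;
    my %by_server;
    push @{ $by_server{ $_->{server_id} } }, $_->{name} for @$groups;
    return \%by_server;
}

my $plan = groups_by_server( \@newsgroups );
for my $id ( sort { $a <=> $b } keys %$plan ) {
    # One connection per server, then pull each of its groups in turn
    print "$servers{$id}{address}: @{ $plan->{$id} }\n";
}
```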

if possible i would really like it, there is this newsgroup i really want to include but it is only available on one server.

any help on where I can obtain a news feed from, cheap as possible please :)

My favourites are:

OnlyNews (http://www.onlynews.com/) for $12.95, as they don't throttle download speeds.

or

Newsfeeds (http://newsfeeds.com/) as they have a whole bunch of different servers to choose from.

but

If you're talking about a traditional IHave feed for your own news server then you are looking at a minimum of $200/mth.

Due to popular request... well 2 requests, I have cleaned up and released a new version of the usenet gateway script.

Some of the new features are:

email notification to usenet replies
emoticon translation into vb icons
hyperlinked urls in messages
italicized quotes
support for xover
logging of outgoing posts

In addition, the code has been completely re-written and is much cleaner (at least I think so).

Some things planned for the version 2 release

support for multiple news servers
canceling of messages after they have been sent to usenet
treating usenet posters as special members so that posts are counted and can be searched by username
control panel integration

The url for the latest version is usenet_gateway_v115.tar.gz (http://britishexpats.com/download/usenet_gateway_v115.tar.gz)

regards
PAJ

Will this usenet gateway be included in vb2.0?

No, but this hack is being ported over.

cool fastforward, it is looking really great. I am looking forward to the vb2.0 version ;)

This is a very cool hack.

Though I wonder how full the database might get with a busy newsgroup.

Have you tried it with a busy one?

And, how does it handle any binary attachments?

I think it doesn't support MIME. It would be cool tho :)

Binary attachments are not supported. Although it would not be difficult to add a feature that saves the binary attachment to a directory and post a link in the forum.

There is a variable called $max_msg_length that can be set to the maximum number of lines a message can have before the post is assumed to be a binary one. The default setting is 300 lines. In addition there is a spam filter that checks for 'begin 644' in the body and rejects the post if it is found (a 'begin 644' line marks the start of a uuencoded attachment).
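
The two checks can be sketched in a few lines of Perl. This is not the code from newnews.pl; the sub name is hypothetical, and anchoring 'begin 644' to the start of a line is a slightly stricter variant chosen here to avoid matching the phrase in ordinary prose.

```perl
use strict;
use warnings;

my $max_msg_length = 300;    # default mentioned above

# Hypothetical helper: returns 1 if the body looks like a binary post
sub looks_binary {
    my ($body) = @_;
    my $line_count = ( $body =~ tr/\n// );    # count newline characters
    return 1 if $line_count > $max_msg_length;
    return 1 if $body =~ /^begin 644 /m;      # uuencode header line
    return 0;
}

print looks_binary("just a short text post\n")           ? "binary\n" : "text\n";
print looks_binary("see attached\nbegin 644 fish.jpg\n") ? "binary\n" : "text\n";
```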

As for busy groups. It all depends on your news server speed, your connection speed and the amount of space you have available for storage. But for practical use where you would maybe have 10-20 newsgroups, there will be no problem at all.

There was a bug in the expire_article routine that was causing articles never to be removed from the usenet_ref and usenet_article tables. The fix is only in the newnews.pl script and affects only one line.

The link in the first post of this thread will always point to the latest release.

Hi...me again. I ran the SQL changes, no problems there (although the spam table isn't showing up on the database...any idea why? Is this required?)

Anyways, made the modifications to the newnews.pl file, uploaded to the cgi-bin directory (I don't believe my web host will allow to execute scripts outside of CGI-BIN), chmodded it to 755. Now I'm getting an Internal server error when I run the script from my browser.

I updated the first line of the script to my local path to perl (I know this works because I had UBB up and running before vbulletin, and I just copied that from the cgi script that was working). The only question in my mind is the following line in newnews.pl:

#use lib "/[path.to.your.local.perl.directory]/perl5/site_perl/5.005";

What is the proper syntax for this line? Could this be causing the internal server 500 error I'm seeing? My path to perl is #!/usr/bin/perl

Another question...why are there two newnews.pl files in the distribution? Which should I be using? The one in the Rcs folder looks different than the other one (header information) so I assumed that wasn't the correct one.

Thanks in advance!

The usenet_spam_filter table IS required. The script will fail if this is not here. It may be the cause of your error. If this is not being created, there may be an error in my SQL. Try running just the part that builds this table.

You don't need the '#use lib' line as your ISP has installed the modules for you.

The shebang line is probably correct. If you had ubb running using that line then it definitely is.

The RCS sub directory is for revision control. You can ignore it. I just included it for completeness.

One point though, I did not intend this script to be executed via the browser and I have no idea whether it will work or what implications it may have. I will do some testing over the next few days and let you know the outcome.

regards

Not sure why the spam table wasn't added, but I ran just that part of your script and it did add the table to the database.

I'm kinda baffled at this one...I changed the newnews.pl again, re-uploaded it to my cgi-bin directory, Telnet'ed into my account, and I can't find the file there!!!! It shows on my FTP program, and I can even hit it with my browser (although I get that same 500 error), but looking for it through the telnet shell, I can't find it! I want to execute it from there so I can see if/what the error is, but can't find it.

Yes, I did CHMOD it to 755, even tried 777 and it just doesn't show up. Any thoughts?!?

stupid telnet program...found the file, so I've now run newnews.pl via telnet. It connects to the server and the right newsgroup and grabs the first 100 messages, but when it goes to dump them to the database, I'm getting the following error:

Requested 100 messages... 0 not available or rejected.
DBD::mysql::db do failed: You have an error in your SQL syntax near ''No User',9
79507486,'ALISHA has done the following\n____________________________' at line 1
at ./newnews.pl line 419.
Query failed (INSERT IGNORE INTO post (allowsmilie,threadid,username,dateline,pa
getext,visible,msgid,ord) VALUES ('1',,'No User',979507486,'ALISHA has done the
following\n________________________________________\nSlept: 7 Touched: 31750 Dro
pped: 37\n________________________________________\nNetbios Rulez \\GUYC \\C \\E
SC400 \\C \\ACI \\BOB PUBLIC \\C GATEWAY \\CUSTOMERS \\DOWNLOADS \\MY DOCUMENTS\
n\\MYLABEL \\PROGRAM FILE \\COMPAQ5695 \\MY MUSIC \\MYDOCS \\C \\PAVILLION A \\P
AVILLION C \\backup \\D \\C \\D\n\\E \\C_PAZR1 \\HOME C DRIVE \\TEMP \\C \\C \\C
\\C \\C \\D \\Downloads \\SHARED \\HHC \\HHPROGRAMS \\HHSHARED\n\n..and the wor
ms ate into his brain\n','1','<0e8cc39d915d766e63245db40b0119cb@anon.xg.nu>','0'
)) at ./newnews.pl line 419.

Any ideas here? I'm at a total loss here. It looks like it's having a problem dumping the messages into the post table, but I'm not an SQL expert).

On another note, a couple of things you might want to consider when you do your next documentation update:

1.) The SQL changes (textfile) - There are two usenet_outgoin_log table create commands

2.) You are using the table names POST and post interchangeably, which is causing errors in the alter table section toward the bottom of the query. Took me a while to figure out that mysql table names are case sensitive.

3.) You might want to make some recommendations about a mysql client. I'm using phpMyAdmin, and it's great. I couldn't imagine trying to do this stuff manually.

That's it for now! Any suggestions on the above?

The problem seems to be that 'null' is being inserted into the threadid column. As far as I can see, there are two possible reasons for this error:

Number 1 is line 292 that reads

$threadid = $dbh->{'mysql_insertid'};

is failing for some reason. mysql_insertid is a database handle attribute provided by the Perl MySQL driver (DBD::mysql) and returns the id of the last inserted record. I have no idea why this would not work on your setup. What version of Perl and MySQL are you using?

Number 2 is if the query on line 306 was pulling back null for the threadid column from the thread table. Now this is virtually impossible as threadid is an auto_increment column with a unique index on it.

I don't want to ask the obvious, but I will :) you are installing this on vB version 1.1.5 or 1.1.4 right?

I'll need to do some more investigation. I'll get back to you.

As for the SQL instructions. I'm an Oracle DBA so I'm used to tablenames only being case sensitive if you put double quotes around them. :) I will correct them asap.

regards

Not a dumb question at all. I am running:

vb 1.1.5
mysql 3.23.32
perl version, well, as you know, hard to determine, but if I issue the command $perl -MDate::Calc -e 'print "$Date::Calc::VERSION \n"' the version returned is 4.2.

Anything else? I've tried a bunch of different stuff, manually wiped out the records in the usenet_article and usenet_ref tables, reinstalled everything (including vb), with no avail.

Thank you, looking very forward to getting this up and running, and your support is better than most technical support companies!!!!!!!!!!!!!!!!!

-RC

OK... looks like you have uncovered a little buggette. I think the problem is this.

When a usenet article is received that will become the thread starter, an entry is made in the thread table. The statement that does this is on line 264 and uses the INSERT IGNORE clause to prevent any possibility of duplicate messages if your news server decides to reindex and change message numbers on you.

However, the side effect to this is should the statement ever actually be ignored due to a duplicate messageid, the next line that gets the mysql_insertid will return null as nothing was actually inserted. So consequently, when it tries to put the corresponding entry into the post table on line 266, it will fail because threadid in the post table is specified as NOT NULL.

Now this is all very well and is easy to fix, but the real issue is why you came across a duplicate. The message id column is supposedly unique across all of usenet. You mentioned you tried again by emptying out the usenet_article table first. If so, the only way a duplicate could occur would be if the article came twice with the same messageid from your news server. But even this shouldn't matter as the usenet_article table has a unique index on the 'msgid' column.

I'll fix the script and post it in about an hour. In the meantime, check the following:

make sure the only unique keys on the thread table are 'threadid' and 'msgid'
make sure the usenet_article table has a unique index on 'msgid'

regards
PAJ

Well, glad (and sorry) to hear it doesn't look like it was me. Sorry to put you through this, but I guess that's the nature of the beast, huh? ;)

Unique keys exist as you state in your message above. Really weird, I agree. At any rate, I look forward to your reply.

Thanks again!


---------------------------------
Captnroger
Forum Moderator
Gofishohio.com
www.gofishohio.com

The fix has been incorporated into the script. The newnews.pl and the sql instructions are affected.

The change simply checks whether the $threadid is null prior to inserting into the post table.

If this doesn't work for you captnroger we'll need to review the entire setup.

If your problem was actually caused by a faulty mysql_insertid function you will not see any messages in your forums.

PAJ
--
the latest version of this script is always in the first post of this thread.

Here's the error I'm getting now (I did run the SQL query without an error though!) Modified the new newnews.pl, ran it, here's the message:

DBD::mysql::st execute failed: You have an error in your SQL syntax near '' at l
ine 1 at ./newnews.pl line 424.
Query failed (SELECT title FROM thread WHERE threadid=) at ./newnews.pl line 42
4.



Sorry to be a pain. Any thoughts?

OK.. assuming the problem is due to duplicate messageids the problem was my fault. It was trying to index that non-existent thread. I've added another check to stop it happening.

I still may have missed some so let me know if it fails anywhere else.

regards

FF -

First of all, I commend your efforts today to help me get this working. You truly are a champ.

The latest mods to the script got me through the importing process, YEAH! The script executed without an error.

What DID happen though is that while my forum has all the headers, along with the author, date and time, if you click on the actual post, there is nothing there. In looking at the database, nothing is actually being posted to the post table. I have three test messages in there that I created, that's it. The newsgroup messages don't make it to the post table. Looks like they are still in usenet_article.

So close!!!!!!!!!!

That indicates that something really is wrong with the 'mysql_insertid' function call or for some reason it's returning null.

Remove the IGNORE word from the SQL in line 303 so that it reads:

db_execute ("INSERT INTO thread (title,lastpost,forumid,postusername,lastposter,dateline,msgid,visible,open) VALUES ($subject,$dtm,$forumid,$poster,$poster,$dtm,$msgid,'1','1')");

and add the following debug line right after line 304 that reads $threadid = $dbh->{'mysql_insertid'};

print "DEBUG: threadid for msgid $msgid is $threadid\n";


Now if the SQL doesn't fail we know it is the 'mysql_insertid' function at fault. If the SQL fails then show me the error message as you did before.

The other thought I have is maybe your news server doesn't fully support the 'xhdr' function. This is unlikely but it would cause any header it couldn't find to be null.

What provider are you using? To check if this is the problem I can give you temporary access to a test DNEWS server I have at home. It will be slow though as I only have 256K upload speed. Let me know your IP and I'll add access for testing.

regards.

hmmm....interesting. Never considered it may be the newsfeed. I'm using newsfeeds.com, although I am using their spam-free newsfeed. Let me play around with this a bit...I'll get back with you.

okay, just to be safe I wiped out the database and started over again, and also changed news feeds. Here is the debug of the session as per your request:

Connecting to News.newsfeeds.com... Connected
Sending authentication info... Authenticated and logged in
Getting article batch from alt.test
Fetching headers of articles 848576 to 848675... done
Fetching article body 848576... OK
Fetching article body 848577... OK

much..
deleted..
here..

Processing article batch...
Requested 100 messages... 14 not available or rejected.

DEBUG: threadid for msgid '<20010207001313.30401.qmail@nym.alias.net>' is
DEBUG: threadid for msgid '<7daeb90ada844104ea268ca9b8bc25dd@mix2.hyperreal.pl>'
is
DEBUG: threadid for msgid '<mm0g6.140$Co5.19947@nntp2.onemain.com>' is
DEBUG: threadid for msgid '<8270e04f916966896207f88be635ef2b@anon.xg.nu>' is
DEBUG: threadid for msgid '<20010207000811.27948.qmail@nym.alias.net>' is
DEBUG: threadid for msgid '<Ek0g6.225629$w35.39920301@news1.rdc1.nj.home.com>' i
s
DEBUG: threadid for msgid '<Nk0g6.1085$D3.4198@tor-nn1.netcom.ca>' is
DEBUG: threadid for msgid '<ih0g6.129$0W5.38111@nntp3.onemain.com>' is
DEBUG: threadid for msgid '<Cc0g6.124$0W5.35089@nntp3.onemain.com>' is
DEBUG: threadid for msgid '<95q1pn$4id$1@lacerta.tiscalinet.it>' is
DEBUG: threadid for msgid '<JW%f6.118$0W5.21387@nntp3.onemain.com>' is
DEBUG: threadid for msgid '<S_%f6.127$Co5.9928@nntp2.onemain.com>' is
DEBUG: threadid for msgid '<A30g6.129$Co5.13100@nntp2.onemain.com>' is
DEBUG: threadid for msgid '<95q2nk$hv28s$1@ID-55073.news.dfncis.de>' is
DEBUG: threadid for msgid '<d80g6.131$Co5.13778@nntp2.onemain.com>' is
DEBUG: threadid for msgid '<3a808f06.949583@news.attcanada.ca>' is
DEBUG: threadid for msgid '<uR%f6.122$Co5.7773@nntp2.onemain.com>' is
DEBUG: threadid for msgid '<nM%f6.111$0W5.13742@nntp3.onemain.com>' is


Lots of these.... ends with...
DEBUG: threadid for msgid '<qk2g6.929$0W5.133222@nntp3.onemain.com>' is
DEBUG: threadid for msgid '<3p2g6.932$0W5.135574@nntp3.onemain.com>' is
DEBUG: threadid for msgid '<Bv2g6.395$Co5.72310@nntp2.onemain.com>' is
Processing outgoing messages


Does this help? I doubt it's the feed (or maybe it is, I dunno) because I'm using newsfeeds.com.

Yeah newsfeeds.com definitely works. I tested it with that.

Well as you can see, the messages are being inserted into the thread table correctly but the threadid is not being returned by the mysql_insertid function. I'm not sure why this is happening. Maybe it's something to do with your mysql setup, I have no idea. But if it happened to you, it will probably happen for others.

I have added another function in there that checks if the threadid is null and, if it is, looks it up again using the message id. So it will try to use mysql_insertid first and if that fails, it will find it itself.
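
In outline, the fallback looks something like the sketch below. It is not the code that went into newnews.pl: the sub name is hypothetical, and it assumes a connected DBI handle using DBD::mysql plus the thread.msgid column discussed in this thread. It also passes the raw message id through a placeholder rather than a pre-quoted string.

```perl
use strict;
use warnings;

# Hypothetical helper: return the thread id for a message, preferring the
# driver's last-insert id and falling back to a lookup by message id.
sub thread_id_for {
    my ($dbh, $msgid) = @_;

    # DBD::mysql exposes the last auto_increment value as a handle attribute
    my $threadid = $dbh->{'mysql_insertid'};
    return $threadid if $threadid;    # undef or 0 means nothing usable

    # Fallback: find the row by its unique message id
    ($threadid) = $dbh->selectrow_array(
        'SELECT threadid FROM thread WHERE msgid = ?', undef, $msgid );
    return $threadid;                 # still undef if the thread is missing
}
```

The caller still has to check for undef before inserting into the post table, which is the same null check described earlier in the thread.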

Try that and see what happens. In the meantime I'm going to see what information I can find out about this function at mysql.com.

Hey, it looks like THAT WORKED!!!!!

Can ya believe it?!?!??!

Kind of a weird thing happened on the first import...the forums showed 'xx threads and xx posts', but if you clicked on the forum, the display was empty. Running the script the second time seemed to work!!!!!!!!!!

I'm going to play around a bit more just to make sure, but thank you!!!!! I'll let you know how it goes tomorrow.

Here was another one this morning...I've added a total of 8 newsgroups. The first few are processed fine, but this one keeps bombing out:

Fetching article body 48392... OK
Processing article batch...
Requested 44 messages... 0 not available or rejected.

DBD::mysql::db do failed: You have an error in your SQL syntax near ')' at line
1 at ./newnews.pl line 444.
Query failed (INSERT INTO usenet_ref (msgid,ref,cnt,dtm) VALUES ('<09020102.3649
@hotmail.com>','(none)',0+1,)) at ./newnews.pl line 444.

???? Looks like a new one.

On the plus side, my posts make it to the newsgroups just fine!

You seem to be hitting all those unusual situations that should never happen! :D This problem is due to the xhdr function not being able to determine the date from the message header. I have added a fix for this.

The other thing you uncovered was the '(none)' in the refs column. This means the message had no references and is a thread starter so there is no need to put it in the refs table as it is already in the thread table. It wasn't doing any harm but it was wasting space.

regards
PAJ
--
the latest version of this hack is always in the first post of this thread

Just call me troublemaker ;)

I guess it's good and bad. Helps make your hack compatible with more systems, bad in the fact that I've put you through a bunch of hassle over the past couple of days.

I've installed a second test copy of vbulletin that I'm testing this hack with, and have previewed it to the other forum moderators on our site. They are impressed! Once we get the bugs worked out, it'll be great to roll this out to our users (many of which know nothing about newsgroups!)

I'll give the changes a whirl, and let you know how it goes.

Thanks again for your speedy reply!

When I run the script now, it stops after pulling the first group as follows:

Connecting to News.newsfeeds.com... Connected
Sending authentication info... Authenticated and logged in
Getting article batch from alt.test
Fetching headers of articles 849566 to 849665... done

It doesn't try to parse the articles. I don't think it's traffic related, as this happens each time I try it. I have to control-c to abort the script.

Sorry to be a pain...

I've seen this problem occasionally. It has to do with strange characters in the NNTPFrom header field. Once the batch is downloaded part of the processing is to parse this 'from' header to extract the email and real name. If there is something it doesn't like in here it can take forever to process. I've seen it take 10 minutes before. It does come back eventually though. This is very rare but I will try to determine exactly which characters it has trouble with.

Which newsgroup is it stopping on. I'll try the same one for testing. Try just letting it run for a while to see if it frees up.

regards

It's hanging on alt.test. I bumped the article number up in the table to bypass the message, and it got much further along (I think it got to the 5th newsgroup), then bombed again:

DBD::mysql::db do failed: You have an error in your SQL syntax near ''wow $10 fo
r each referals and get paid for reading emails make money','skidrow' at line 1
at ./newnews.pl line 448.
Query failed (INSERT IGNORE INTO usenet_article (newsgroup,forum,msgid,dtm,subje
ct,poster,email,refs,body,msgnum,nntpposter,ord,threadid,postid) VALUES ('alt.fi
shing.catfish',8,'<09020102.3650@hotmail.com>',,'wow $10 for each referals and
get paid for reading emails make money','skidrow1','skidrow1@hotmail.com','(none
)','hay guys check this out this is so cool u can make so much money just by ref
erals and reading emails its owsome its easy money very easy money all u have to
do is take out 5 mins daily and start makeing money \n\n\nhttp://www.inboxcash.
com/$10/referral.asp?id=521016',6731,'skidrow1@hotmail.com',0+1,0,0)) at ./newn
ews.pl line 448.

Once again, got around this by bumping up the article number in the table. Clearly there is a character problem somewhere that the script doesn't like.

That SQL error is caused by not finding the date in the header again. There are two possible places for the date in the header, one is the 'Date' field and the other is 'NNTP-Posting-Date'. Which ones get filled in depends on the news server and the posting client.

When I was using newsfeeds.com to test this script I was parsing the header myself to find any one of these. However, I rewrote that part to use the xhdr function as it's more reliable. Since the rewrite I haven't tested it on newsfeeds as I no longer have an account there. So it seems that newsfeeds is not consistent in which header field is used for the date.

I will add more logic to try Date first (the client posting date), then if that's not there, use NNTP-Posting-Date (the server date).
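
Roughly, the lookup order would be something like this Perl sketch. The sub name and the hash-of-headers input are assumptions for illustration, and the last-resort fallback to the current time is an extra guard added here, not something the script is known to do.

```perl
use strict;
use warnings;

# Hypothetical helper: $headers maps header names to raw values for one article
sub article_date {
    my ($headers) = @_;
    # Client posting date first, then the server's posting date
    for my $name ( 'Date', 'NNTP-Posting-Date' ) {
        my $value = $headers->{$name};
        return $value if defined $value && $value =~ /\S/;
    }
    # Assumption: if neither header is present, use the time of the pull so
    # the INSERT never ends up with an empty date value
    return scalar gmtime();
}

print article_date( { 'Date' => 'Wed, 07 Feb 2001 00:13:13 GMT' } ), "\n";
print article_date( { 'NNTP-Posting-Date' => 'Wed, 07 Feb 2001 00:15:02 GMT' } ), "\n";
```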

I'll have the fix done in an hour or so.

OK... here ya go.

I've tested this on around 600-700 messages from alt.test using onlynews.com and also a DNEWS server.

You will get a lot of crap from alt.test though as everyone is testing similar stuff to this. There'll also be a lot of binary and mime stuff in there.

let me know how you get on.

Bingo, that did it! Back in business. Got right around the post that was causing the crash. I've setup cron to pull every 30 minutes, so I'll give it a couple of days to see if it trips up any more.

Thank you again so much. If I can send you a contribution for your troubles, drop me an email!

Three days now without an error!!!!! Very well done!!!

I think there may be a problem with the email notification of new replies, but I'm still looking into that.

Cheers!

I think the email reply routine makes use of the mysql_insert_id function that you were having problems with before. I'll have a look and confirm it. If that's the case, it makes sense that you're having problems and I'll come up with a solution for you.

I've also had a play with vB version 2 and as far as I can see there should be no problems making it work. The backend should work with little or no changes. Just the changes to the files that handle new posts and replies will need to be redone. But I'm waiting for the final release before I do anything.

regards

looking forward to it :)

Just for my own time table - what are the current plans on this for the 2.0 release?

I actually thought it was ready this weekend but I came across a problem when dealing with mailing list archives such as the mailing.database.mysql newsgroup. Because these are posted to via an email program rather than a news client, I need to add more functionality to check for parent posts without relying on the X-References header.

At worst it will be out by Sunday, but as long as I have no major set-backs it should be ready mid to late this week.

You can see how it's coming along at dBforums.com (http://dbforums.com). You may see it filling up and emptying of posts at random while I'm testing.

There will be no control panel access in this release. I'll try and finish that by the final vB 2.0 but I'm not promising.

Some of the extra stuff that will be in this version are:

Handling of MIME quoted-printable headers (for all those funny foreign characters)
Correct handling for mailing archive groups
Color coding and indenting of quotes (customizable via replacement vars)
vB style quote to usenet style quote conversions
Usenet posters can be auto-added as special vB users to allow searching and post count tracking (they can easily be converted to real users by changing usergroup).
email notification to vB users of 'replied to' usenet posts (uses existing vB template)
Multiple news server support
plus some other guff that I can't remember.

Would this script be compatible with a mailing list instead of just newsgroups? I have a mailing list archive (1994 to present) I want to put into a forum and then have email to a certain address be piped into a php or perl script to be inserted into the forum. The latter isn't as important since the main mailing list that I want to do is also a newsgroup, but there are a couple other mailing lists I'd like to put into their own forum. Anyways, being able to read a standard mail archive file (like what the mail spool file is) and stuff it into a forum all threaded and such would be nice.

You couldn't use it as is, but it wouldn't be too hard to modify. The script has four main sections at the moment:

Retrieve articles from server
Parse headers, format and spam control
Thread and sort
Insert into forums

2,3 & 4 can be pretty much left alone (maybe some unneeded stuff can even be taken out). Number one just needs to be changed to get the messages from the mail spool. If you don't fancy doing it yourself I will take a look after I've released a stable(ish) version.
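
To give a rough idea of what swapping out stage one could look like, here is a small Python sketch (the real script is Perl, so this is only an illustration). It assumes the archive is a standard Unix mbox file; the function name summaries_from_mbox and the dictionary keys are invented for the example.

```python
import mailbox


def summaries_from_mbox(path):
    """Yield one dict per message in an mbox spool file, carrying the
    fields the later stages (parse, thread, insert) would need.
    Hypothetical stand-in for the 'retrieve articles' stage."""
    box = mailbox.mbox(path, create=False)
    try:
        for msg in box:
            yield {
                "msgid": msg.get("Message-ID"),
                "subject": msg.get("Subject", ""),
                "date": msg.get("Date"),
                # References is a whitespace-separated list of message ids
                "references": (msg.get("References") or "").split(),
            }
    finally:
        box.close()
```

Because the rest of the pipeline only sees message ids, subjects and references, it should not need to care whether those came from an NNTP server or a spool file.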

will post some "more" :) things as i think of them...

* for usenet posters, somehow the offline/online thing should not happen as it makes no sense. I guess this is an edit in showthread.php based on usergroup.

* questionable value of "report to moderator" in usenet groups. I am not sure if moderating a usenet group is really something feasible or practical so perhaps there is a way to turn these off. I assume there is a field in forum table that says isUsenet type thing, perhaps we can do an if/elseif on report to moderator based on this?

* A way to restrict searches from usenet and non usenet forums. I guess this is an extra function that would be needed inside search. I know it could be done by setting the forumids manually inside search.php but a better way would be to autogenerate this on the isUsenet type field.

all for now...

Originally posted by fastforward
You couldn't use it as is, but it wouldn't be too hard to modify.

Before you came out with this hack, I was planning on writing my own email <-> forum thing anyways... but no point in reinventing the wheel if yours does most of what I need. I haven't looked at your script to see if it's what I need yet.

Anyways, for the mail to be processed by the script, it's pretty easy to do on *nix systems with sendmail. Just make an alias in the aliases file that calls the program, like:
forumlist: "|/path/to/script.pl"
Then email sent to forumlist@whatever.com will be passed to the script on STDIN.
The path here is /path/to/script.pl, but on a lot of systems a security measure is in place so that not just any program can be executed from an alias, so a link to the script needs to be made in the /etc/smrsh directory.
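
As a rough sketch of the receiving end, this is what a script on the other side of that pipe might do, written in Python for illustration. The function name read_piped_message and the returned keys are assumptions; it simply parses one message from a binary stream, which would be standard input when called from the alias.

```python
import sys
from email import message_from_binary_file, policy


def read_piped_message(stream=None):
    """Parse a single email from a binary stream (stdin by default) and
    return the fields a forum insert would need. Hypothetical helper."""
    if stream is None:
        stream = sys.stdin.buffer
    msg = message_from_binary_file(stream, policy=policy.default)
    body = msg.get_body(preferencelist=("plain",))
    return {
        "from": msg["From"],
        "subject": msg["Subject"],
        "msgid": msg["Message-ID"],
        # Mail clients usually set In-Reply-To even when References is absent
        "in_reply_to": msg["In-Reply-To"],
        "text": body.get_content() if body is not None else "",
    }
```

Called with no argument it reads the piped message from STDIN; passing a stream explicitly makes it easy to try out against saved messages first.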

I cannot wait for the V2 version m8, already set up, ready for it to go!

fastforward, did u just reply??

I have been a bit busy at work this past couple of weeks and didn't manage to get all the promised features into this release. But not to worry as vB 2.0 is still only beta.

This release is for vB 2.0 beta3.

One major addition to this release is the ability to have usenet users added as forum users with a special usergroup and title to allow searching by username and display of post numbers. This option is still experimental and is not turned on by default. I do have the option on at dBforums.com (http://dbforums.com) and have had no major problems. You are welcome to try it yourself by setting the option in the script.

One omission from this release is control panel integration. I have been too busy to look at this and it will probably remain at the bottom of the 'to-do' list until all other functionality is in.

See the first post in this thread for other additions and information.

The instructions for this release were rushed out so may contain errors. (although I did give them a test run and they seemed ok)

Download usenet_gateway.tar.gz (http://britishexpats.com/download/usenet_gateway_v2.0b3.tar.gz)

Regards

Originally posted by etones
fastforward, did u just reply??
Yeah... I got confused while submitting a new post and editing the first post in the thread. I had two browser windows open and got V confused and posted the edit as a new post :confused: so I deleted it and restarted.

fastforward,

This is a massive hack! Thanks for the contribution. This should be great on my forum.

Just noticed the $usenet_usergroup_id=x from the script is not used, and a hardcoded group id of 8 is used in php code replacements. Similar variables are used like that, I think, like usenet user title. Just FYI, I'm still installing it with anticipation.

That's the only one. The problem is there is no way for the perl script to pass the variable to the php. I suppose I should really have set it up in config.php.

Another thing to add to my list. :(

I have a couple of newbie questions.. please be kind :)

> 3. Run the usenet.sql script.

How do I do that?
is it
mysql -uusername -ppasswd mydbname < usenet.sql

One thing that worries me is "... and adds some indexes and columns to existing tables."

Is it possible that there will be problems when I upgrade my vB in the future, now that this hack has inserted new columns etc. into existing tables? For example if there are new columns that go "on top" of these usenet columns (yes, I know nothing about mysql/databases and I may be worrying for nothing : )

Are there any other possible problems when upgrading to a new vB version besides need to modify all php-files again?

If i install this hack and want to remove it later what should I do?

or

If I install this hack but need to reinstall it after I have downloaded some newsgroups. What should I do to remove the existing news-messages so that I can start download them again?

Thanks!

This hack is NOT for the faint hearted!

You will definitely have problems upgrading to a newer version of vB. The table changes are the least of your worries. Any vB code changes to the newreply or newthread files after this hack has been installed will make it impossible to post due to primary key violations. The changes are minor and it's easy to re-apply to any new release but I wouldn't advise you to implement this hack unless you are confident with mysql and php.

I doubt Jelsoft will offer much sympathy if you ask for support after installing a hack :)

As far as emptying out the usenet posts to restart... I have a little script I knocked up for that. I'll post it when I get home this evening.

Ok, thanks for a quick reply.

I used to run UBB so hacking is pretty familiar thing to me and I'm just trying to estimate what risks might be involved here. It seems that at this point this is not for me :(

This hack is just so good that it would be more than great if Jelsoft makes this part of vB at some point in the future.. sooner or later, preferably ASAP! ;)

I hope that there will be some version of this hack that is suitable for us who are not that familiar with php or mysql. (easy to install and maybe some script to uninstall this hack before a vB upgrade and then somehow install it back to the system).

This script will empty the forums of all usenet posts, users and clean up the index tables. It will empty all posts in a usenet forum regardless of who posted it.

Make sure you have a working backup before running this for the first time! :)

What it does:

Deletes all usenet threads from thread table for all usenet forums specified in usenet_group.
Sets the replycount, lastpost details to null for all usenet forums.
Deletes all posts that belonged to the threads we just deleted.
Deletes all searchindex entries for posts that we just deleted.
Resets the lastmsg numbers to zero in the usenet_group table.
Empties the usenet_article and usenet_ref tables
Deletes all users with a usertitle of 'Usenet Poster'
Deletes userfield table entries for the users we just deleted

Mmmm.. I have a slight problem with 2.0b3... Everything runs smoothly; the only flaw is that in normal forums (the vB forums) I suddenly get the Registered: and Location: fields printed twice under the username, while in Usenet forums they're displayed only once. Went through the instructions like 5 times already, everything seems to be just fine. I was just wondering if anyone else had a similar issue...

fastfoward...

just one thing on the admin script. If you run the news script for a few weeks and then decide to upgrade, purge everything and do a re-install, the newsgroup posts sent via the forums won't be uploaded because it checks to ensure they are not there if you see what i mean. Is there a way to override this on a fresh install?

Originally posted by chrispadfield
fastfoward...

just one thing on the admin script. If you run the news script for a few weeks and then decide to upgrade, purge everything and do a re-install, the newsgroup posts sent via the forums won't be uploaded because it checks to ensure they are not there if you see what i mean. Is there a way to override this on a fresh install?
I thought of this myself while I was trying to figure out the best way of upgrading my boards. The easiest way is to add an option to bypass the check and just load everything in.

This release (2.1) fixes some bugs and makes a few efficiency changes.

Files Changed:

config.php
newnews.pl
memberlist.php
newreply.php
functions.php

Changes:

moved $usenetusergroupid to config.php to avoid hard coding the id
added variable $lastactivethread_length in config.php to limit length of thread title if used in templates.
corrected thread and forum replycount problems
will re-import locally originated usenet posts if not already in forum
added $recover_deleted_localposts option to re-import any locally originated posts that are on usenet but no longer in the forums due to being deleted. If this option is on, it will mean legitimately deleted posts will also be re-imported.
fixed problem with outgoing messages being deleted from outgoing table before being sent!
made multiple news server support an explicit option to prevent unnecessary connects/disconnects

The documentation has been updated and there is an upgrade set of instructions if you already installed version 2.0. Download (http://britishexpats.com/download/usenet_gateway_v2.1__20b3.tar.gz) The latest release is also in the first post of this thread

The link you just provided is broken. Can you check it and provide a new link please?

Originally posted by DumbQuestion
The link you just provided is broken. Can you check it and provide a new link please?
Ooops! :o
It's fixed now

Your installation instructions refer to "usenet.sql", but the file you provided for download does not have this file.

It contains "vB_sql_changes.sql". Do we use this file instead?

Originally posted by DumbQuestion
Your installation instructions refer to "usenet.sql", but the file you provided for download does not have this file.

It contains "vB_sql_changes.sql". Do we use this file instead?
Yep. That's the one. I'll correct the docs now.

Has anyone tested this with a free feed? I only need one usenet group (no binary or xxx) to test it and see if it works. I used nezpig for a while, but can't connect to them anymore.

Nice boards!

Man that ubb like forum view is awesome! You should make a hack about it! I'd sure get it!:) (The little icon next to name in all forums listings..)

I am waiting for the next release on the usenet hack to come out along with vb2 beta4.

Question: When is the next usenet hack beta version coming out with the ability to allow only a few members or group of members to post?

Also I have subscribed to newsfeeds.com which I understand will work with this hack?

Thanks and I can't wait to incorporate this awesome hack into my site.....

DON'T USE NEWSFEEDS :)

I was using them but they seem to have added some lovely little signature to every one of their posts automatically saying "Message posted by Newsfeeds.com" but it is a lot bigger and uglier than that. I think if you use the spamkiller server they have it is ok, but that server seems to have low article retention (ironically).

I am going to go for someone else instead as it makes posts from your site look very unprofessional. Any suggestions?

You might want to look at onlynews.com. I recently switched from newsfeeds to them. They have no speed cap which is ideal for this particular script. On my dbforums site I can pull in 50 posts in about 8 seconds. This extra speed is welcome when you have 70 groups mirrored.

Too late....I had already paid for the year since it was much cheaper that way. What I would like to do is setup my.newsfeeds.com with the groups that I am interested for my members thus it should be very fast according to newsfeeds.com docs.

When my certain group of members who can post to the usenet forum send it through it will go to spamkiller.newsfeeds.com stripping out the footer that is appended to every post.

I hope it works like that...:)

Any word when the next version will be out? I've added all the modules as requested (I hope) to install and run the hack.

Originally posted by fastforward
You might want to look at onlynews.com. I recently switched from newsfeeds to them. They have no speed cap which is ideal for this particular script. On my dbforums site I can pull in 50 posts in about 8 seconds. This extra speed is welcome when you have 70 groups mirrored.

Fastforward,

I did the SQL and PHP updates, edited vars in newnews.pl but get the following errors:

I installed the 2.1 for beta3

Global symbol "$db" requires explicit package name at ./newnews.pl line 46.
Global symbol "$db_host" requires explicit package name at ./newnews.pl line 47.
Global symbol "$db_username" requires explicit package name at ./newnews.pl line 48.

Bareword found where operator expected at ./newnews.pl line 53, near "$smtp_server="localhost"
(Missing operator before localhost?)
String found where operator expected at ./newnews.pl line 65, near "$useragent=""
(Might be a runaway multi-line "" string starting on line 53)
(Missing semicolon on previous line?)
.......

Global symbol "$batch_limit" requires explicit package name at ./newnews.pl line 53.
Global symbol "$expire_days" requires explicit package name at ./newnews.pl line 53.
Global symbol "$italic_quotes" requires explicit package name at ./newnews.pl line 53.
Global symbol "$indent_quotes" requires explicit package name at ./newnews.pl line 53.
Global symbol "$colorize_quotes" requires explicit package name at ./newnews.pl line 53.
Global symbol "$hyperlinked_uri" requires explicit package name at ./newnews.pl line 53.
Global symbol "$allowsmilies" requires explicit package name at ./newnews.pl line 53.
Global symbol "$email_notify" requires explicit package name at ./newnews.pl line 53.
Global symbol "$article_wrap" requires explicit package name at ./newnews.pl line 53.
Global symbol "$max_msg_length" requires explicit package name at ./newnews.pl line 53.
Global symbol "$output_to_console" requires explicit package name at ./newnews.pl line 53.
Global symbol "$useragent" requires explicit package name at ./newnews.pl line 53.
Bareword found where operator expected at ./newnews.pl line 65, near "$useragent="LowCarber"
./newnews.pl has too many errors.

Hmm,

I figured out the above problems. It was due to bad quotes """ (my fault)

Now I can see the articles retrieved, but it fails to add them to mysql:


Fetching article body 284512... OK
Fetching article body 284513... OK
Processing article batch...
Requested 150 messages... 1 not available or rejected.

inserting new threads into forums
DBD::mysql::db do failed: Unknown column 'isusenet' in 'field list' at ./newnews.pl line 595.
Query failed : INSERT IGNORE INTO user (usergroupid,username,password,email,joindate,isusenet,posts,lastvisit,lastactivity,usertitle,customtitle,lastpost) VALUES (19,'Luci C.','usenet','bobbiesoxx@aol.comnospam',985074383,1,1,985074383,985074383,'Usenet Poster',1,985074383) : at ./newnews.pl line 595.

Did you read onlynews agreement?

DO NOT use automated unattended news posting and retrieval programs with our service.
Improper use of these automated unattended posting and retrieval news programs can
result in termination of your account or interruption of service without notice. We will be
disabling accounts of any customers that abuse the system with these (or other)
commercial, automated unattended news posting and retrieval programs. OnlyNEWS
Usenet account services are designed and priced for individual use and NOT for
commercial level archiving of gigabytes of data. For purposes of this section and purposes
of clarity, abuse of the system includes but is not limited to downloading in excess of 5
gigabytes over a 30-day period. Your individual account is for personal, non commercial
use -- NOT for archiving, newsfeeding, news sucking, or testing experimental software. If
you would like a commercial newsfeed, please contact our sales staff at

Originally posted by tamarian
Hmm,
Query failed : INSERT IGNORE INTO user (usergroupid,username,password,email,joindate,isusenet,posts,lastvisit,lastactivity,usertitle,customtitle,lastpost) VALUES (19,'Luci C.','usenet','bobbiesoxx@aol.comnospam',985074383,1,1,985074383,985074383,'Usenet Poster',1,985074383) : at ./newnews.pl line 595.

Rename the column in the user table from 'isusenetpost' to 'isusenet'. It seems I made a mistake in the SQL installation script. The hardest bit about this hack has been getting the docs right!

Originally posted by cyo
Did you read onlynews agreement?

You'll see the same restrictive guidelines on most of the decent personal news accounts. Although strictly speaking it could be argued this program probably violates some of these guidelines, if you are simply sucking a selection of text based groups, you are using well below the 'abuse' limit. And this isn't experimental software.... it's a 'release' :)

And this isn't experimental software.... it's a 'release' :)

hehe....

come on beta4 i want to install this!

We now have control panel integration!

Some of the other features include:

all variables are now set via the control panel and stored in the database. (except the database connect string obviously)
support for separate footers per forum in outgoing posts
purge options per forum or all usenet posts. can also be set to purge only non-real user posts.
selectable multi-language characterset support for MIME quoted printable strings
usergroup assignable permission to allow/disallow usenet posting
documentation tidy up. (I have personally gone through the installation using the docs on a vanilla vB several times to clear up any typos/errors etc.)
a whole bunch of minor bug fixes


Download (http://britishexpats.com/download/usenet_gateway.tar.gz) version for vB 2.0 beta3

Screenshots of control panel
Image 1 (http://britishexpats.com/download/Image1.gif) Image 2 (http://britishexpats.com/download/Image2.jpg) Image 3 (http://britishexpats.com/download/Image3.jpg) Image 4 (http://britishexpats.com/download/Image4.jpg) Image 5 (http://britishexpats.com/download/Image5.jpg)

I must say this is one of those "out of the box" thinking hacks that is extremely practical and extremely original!!

Quick question though, will the new release make installation and configuration a breeze for n00b's like myself (I wake up every morning and refer to a manual just to get my computer turned on) :)

2.2.20 beta 3... I'm getting stuck on sql updates:

ALTER TABLE post ADD INDEX(tsp);

"Error 1069. Too many keys specified. Max 16 keys allowed"

Anything i could alter in my MySQL settings to fix this?

Originally posted by v0n
2.2.20 beta 3... I'm getting stuck on sql updates:
ALTER TABLE post ADD INDEX(tsp);
"Error 1069. Too many keys specified. Max 16 keys allowed"
Anything i could alter in my MySQL settings to fix this?
As long as the msgid column has a unique index you can leave the other indexes off. You may want to re-think the necessity of all your other indexes though. 16 indexes on the post table seems a little excessive :)

Originally posted by fastforward
16 indexes on the post table seems a little excessive :)

I would say very excessive.... it must take forever to add new posts to the database. Unless you are using all those keys (I doubt you are), I would delete those that aren't being used.

Found where the problem was..
a bug in the web interface I use to insert stuff into the db. It created extra indexes on each run in the past (msgid, msgid_1, msgid_2 etc..). My fault, should have debugged the script...

Btw.. many thanks for this mod, it's fantastic.

Originally posted by claypots
Quick question though, will the new release make installation and configuration a breeze for n00b's like myself (I wake up every morning and refer to a manual just to get my computer turned on) :)
I did toy with the idea of writing an installation script that parses the vB code and replaces it where necessary, but I figured that most people that install this hack would already have other hacks installed that may have already changed parts of the same code. It also relies on Jelsoft not putting up a sneaky release without changing the version number :)

But if there's enough call for it, I'll create an install script.

go for it m8.

Originally posted by etones
go for it m8.

I seconded the motion......

:)

This is just a minor update to the newnews.pl script. The sql statements have been revisited and optimized to make better use of indexes. The 'insert ignore' statement has also been replaced with a function that wraps the sql statement into an eval() to trap errors which will prevent the script dying at a key violation. This allows me to make more use of the mysql_insertid function which is obviously far more efficient than re-looking up the key of inserted records manually.
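
The pattern itself is easy to show outside of Perl. Below is a minimal Python sketch using the standard sqlite3 module in place of MySQL, so the table layout and the names make_demo_db and insert_post are assumptions made up for the example. The insert is attempted, a key violation is trapped instead of killing the script, and on success the new key comes straight back from the driver instead of a second lookup.

```python
import sqlite3


def make_demo_db():
    """Create a throwaway in-memory table with a unique msgid column
    (hypothetical schema, for illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE post ("
        "postid INTEGER PRIMARY KEY, msgid TEXT UNIQUE, title TEXT)"
    )
    return conn


def insert_post(conn, msgid, title):
    """Insert one post. Return the new postid, or None when the msgid
    is already present (the duplicate-key case)."""
    try:
        cur = conn.execute(
            "INSERT INTO post (msgid, title) VALUES (?, ?)", (msgid, title)
        )
    except sqlite3.IntegrityError:
        return None  # key violation trapped; carry on with the next article
    return cur.lastrowid  # key of the row just inserted, no re-lookup needed
```

A return value of None tells the caller the article was already there, which is the same information 'insert ignore' was hiding.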

Basically this means the script will place far less load on the server during news pulls. I now run mine every 10 minutes and I never know it's running.

By the way, I will start an install script this weekend. It will only ever be useful on a fresh copy of vB of a specific version obviously. I will try to update the script within a few days of each official vB release.

I suppose I'll have to write it in PHP so that it can be run from a browser; I'd much rather do it in Perl. Ah well I'd best go and start reading my PHP for Dummies book :(

Originally posted by fastforward
This is just a minor update to the newnews.pl script. The sql statements have been revisited and optimized to make better use of indexes. The 'insert ignore' statement has also been replaced with a function that wraps the sql statement into an eval() to trap errors which will prevent the script dying at a key violation. This allows me to make more use of the mysql_insertid function which is obviously far more efficient than re-looking up the key of inserted records manually.


Fastforward, thanks for a great hack! I installed on a test directory and it's running great, no problems.

I have a question regarding upgrades to future versions of vb, like 2.0 non-beta etc. I have no problems updating php, but I'm pretty green in sql stuff, I can just do the basics, backup, restore, etc.

If I install the latest usenet hack on vb2 beta 3, what would be involved in updating the database to future vb release? I don't mind losing the usenet posts, but don't want to lose members posts and information?

Originally posted by tamarian
If I install the latest usenet hack on vb2 beta 3, what would be involved in updating the database to future vb release? I don't mind losing the usenet posts, but don't want to lose members posts and information?
You should be OK for the most part. The usenet tables are all prefixed by 'usenet_' and it's unlikely they will ever conflict with a vB table. There are some added columns to the vB tables but again, it's unlikely they will conflict as they only have relevance to usenet stuff.

Where you may run into problems is the indexes. It should be fine when vB 2.0 is finally released, but with each beta I see a different set of indexes on the tables. One major omission on beta 3 is the unique index on the 'username' column in the user table. This is required if you are importing users from usenet so my script adds it. If beta 4 fixes this omission, you will end up with redundant indexes.

I can't foresee any scenario that would lead to posts or users being lost or removed so you're safe there.

Don't worry too much as I have this installed on my boards and it will be in my interest to make sure it all works for each new release :)

I'm getting this error when I click on Options in the Usenet control panel:

Warning: Offset 0 is invalid for MySQL result index 6 in /home/tamarian/lowcarber-www/forum/admin/db_mysql.php on line 153

That's a standard PHP/Mysql Error. It usually means a table or column can't be found. What about the 'Groups' and 'Spam Control'; do they work?

Make sure you have the 'usenet_setting' & 'usenet_settinggroup' tables.

Yes, the other buttons work.

I deleted the 2.3 newnews.pl and installed the 2.2 one on top of the 2.3 hack. I added the groups through phpMyAdmin and things work fine as far as pulling the groups from nntp.

I can't figure which table is missing though. Any tips on what to look for? Would you like a dump of the schema?

When I ran the sql script, it issued an error regarding a specific forum user!! But since the update proceeded and created the tables, I dismissed that error.

The only tables the php web page uses are 'usenet_setting' and 'usenet_settinggroup'. It doesn't actually have anything else to do with the collecting of news. It is purely for setting the options.

Try downloading the latest hack and running the usenet.php from that. If that fails, drop your usenet_setting and usenet_settinggroup tables, run the SQL that creates and populates them (only those two). You'll have to re-enter your options obviously.

I've just run the sql script on a fresh vB and loaded up the options with no problem.

In the meantime I'll try to reproduce the problem at my end.

I have the latest hack 2.3 downloaded this afternoon.

I ran the usenet.php script and it gave the same error, so I followed your second suggestion: dropping the two tables and adding them again (through the sql script trimmed to only those two). It worked, and I was able to update the usenet options. It gave me this warning:


Your changes have been saved.
Warning: Cannot add header information - headers already sent by (output started at /home/tamarian/lowcarber-www/forum/admin/adminfunctions.php:17) in /home/tamarian/lowcarber-www/forum/admin/usenet.php on line 97


But the 2 tables were populated with my entries, and I used the 2.3 newnews.pl which worked fine.

Thanks for your continued help, I have no idea what went wrong, but the drop/load solution worked.

I just got the following db error in replying to a usenet message, I guess it's the ''' in the title:


Database error in vBulletin: Invalid SQL: UPDATE forum SET lastactivethread = LEFT('Atkin's baking mix',35) WHERE forumid = 30
mysql error: You have an error in your SQL syntax near 's baking mix',35) WHERE forumid = 30' at line 1
mysql error number: 1064
Date: Sunday 25th of March 2001 04:26:21 PM
Script: /newreply.php

Looks like I forgot the addslashes() to that. That will have to be a manual edit to the code I'm afraid. I'll have to update the instructions. I don't know why I missed that. It's in my setup but not in the instructions.

There is one occurrence in admin/functions.php that you'll need to change.

It's on or around line 869 and reads:

$DB_site->query("UPDATE forum SET lastactivethread = LEFT('".$threadinfo[title]."',$lastactivethread_length) WHERE forumid = ".$threadinfo[forumid]);

Change it to:

$DB_site->query("UPDATE forum SET lastactivethread = LEFT('".addslashes(htmlspecialchars($threadinfo[title]))."',$lastactivethread_length) WHERE forumid = ".$threadinfo[forumid]);

Sorry about that. :o
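
For anyone curious why the unescaped title broke the query, here is a small Python sketch of the same update using the standard sqlite3 module. This is a different technique from the addslashes() fix above: it uses bound parameters, so the quote in the title never becomes part of the SQL text. The function name set_last_active_thread is made up, and substr() stands in for MySQL's LEFT().

```python
import sqlite3


def set_last_active_thread(conn, forumid, title, max_len=35):
    """Store a truncated thread title on the forum row using bound
    parameters, so quotes in the title cannot break the statement.
    Hypothetical helper; schema is assumed for illustration."""
    conn.execute(
        "UPDATE forum SET lastactivethread = substr(?, 1, ?) "
        "WHERE forumid = ?",
        (title, max_len, forumid),
    )
```

With this approach a title such as Atkin's baking mix is passed as data and goes through unchanged.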

If you're updating the docs. there's a minor change that may need to be added. In regards to removing the "Registered: " from postbit, I think the Location needs removing as well? It appeared in duplicates after I installed the latest 2.3 usenet hack. Just FYI.

I can't stop saying how great this hack is! :)

Several errors in the documentation have been corrected.

There was one instance where the 'OLD CODE' was, in fact, the 'NEW CODE'.

Another omission led to the word 'location:' being displayed twice for registered users.

And finally a serious error in admin/functions.php that caused posts to fail if the thread title contained a quote.

Any idea on how long the install script will take? If it will take more than a week I will just install it manually.

Thanks for any info.....

Michael

Originally posted by mkilty
Any idea on how long the install script will take? If it will take more than a week I will just install it manually.

I haven't actually started it yet... sorry. I've been busy doing a new CV ( that's resume for you yanks :D ). Time for a new job change :)

I'd install it by hand if I were you. It's quite straight forward. It's just replacement of code in about 3 or 4 files.

Maybe Jelsoft will let me put ready hacked files in the members area. That would make it easier. Trouble is, if they do it for one they'll have to do it for everybody.

Originally posted by fastforward


There was one instance where the 'OLD CODE' was, in fact, the 'NEW CODE'.



Which one's that? Not sure if I caught that, or it's screwed up, I installed the 2.3 version of the hack.

The NEW CODE part was still OK. So if you managed to find the correct bit to replace you'll be fine. It was actually the same section as the missing addslashes() problem.

I am having a problem where some replies are not being inserted into the threads they belong to and just stay in the usenet_article table. While a few of them do not have a parent post to be inserted under, for the majority the thread has been started and the post being replied to is in there, but it still fails to insert them. These messages end up being about 10% of the total posts from the rec.sport.unicycling newsgroup.

I also tried it with some other newsgroups on a server that was not a usenet server and had the same problems with those ones too. Here is the test newsgroup that I tried on that server: news://news.webdiscuss.com/webdiscuss.test

It's not just the thread starter that prevents an article being inserted. Each usenet article has references to every post that was posted before it. An article will only be placed into the forum when the post immediately before it is available. If it didn't work like this you would get posts appearing as replies to a message that doesn't exist. Very confusing!

When you first start pulling news this occurs a lot, because many of the articles will have expired while their replies are still there. As time goes on the surplus will expire from the usenet_article table and the number will reduce. Provided your news provider gets all posts, you will get them too. I am confident this part of the script is working, as it is the part I spent longest on to ensure all articles were collected and inserted when appropriate.

An example is my dbforums site. It has been running long enough for the orphan messages to fall through the system and I now have about 800 articles in the usenet_article table. This is from 70 usenet groups. So that is the number you can expect after about a week of running (provided your expire is set to 5-7 days in the control panel).
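To illustrate the rule described above (an article only goes into the forum once the post immediately before it is already there), here is a rough Python sketch. It is not the actual newnews.pl code, which is Perl; the function name, the dictionary layout and the message ids are all made up for the example, and it assumes each run takes a single pass against what the forum held when the run started.

```python
def run_once(pending, posted):
    """One 'insert into forum' pass.

    pending: list of dicts with 'msgid' and 'refs' (oldest reference first).
    posted:  set of msgids already in the forum; updated in place.
    Returns the articles that still could not be inserted.
    """
    snapshot = set(posted)  # what the forum held when this run started
    still_pending = []
    for art in pending:
        parent = art["refs"][-1] if art["refs"] else None
        if parent in snapshot:
            posted.add(art["msgid"])    # parent is there, so insert the reply
        else:
            still_pending.append(art)   # wait for a later run (or expire)
    return still_pending


posted = {"<a>"}  # the thread starter is already in the forum
pending = [
    {"msgid": "<d>", "refs": ["<a>", "<b>", "<c>"]},
    {"msgid": "<b>", "refs": ["<a>"]},
    {"msgid": "<c>", "refs": ["<a>", "<b>"]},
    {"msgid": "<x>", "refs": ["<gone>"]},  # parent expired upstream: an orphan
]

runs = 0
while True:
    remaining = run_once(pending, posted)
    if len(remaining) == len(pending):
        break  # nothing moved this time, so stop
    pending = remaining
    runs += 1
```

In this toy data the chain a, b, c, d needs three passes to drain, and the orphan stays behind until it would be expired.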

Let me know if this would explain the numbers you are seeing.

And you are using the latest version, right?

Originally posted by fastforward
It's not just the thread starter that prevents an article being inserted. Each usenet article has references to every post that was posted before it. An article will only be placed into the forum when the post immediately before it is available.

The post that the message is a direct reply to is there and inserted into the forum already. If you try the test forum that I mentioned, there are three threads and no missing or expired posts, yet it happens in there. Posts failing to appear in the forum happens in the real newsgroups as well, even though the entire thread exists on the news server along with the post that it is a reply to.

Fastforward, my members didn't like the usenet messages being included in the "view new posts" search, as we have much slower traffic, so I thought it might be a useful option in the CP to include or exclude usenet messages from the search new posts function. Just an idea. I intend on hacking it in the meantime to exclude usenet messages.

Originally posted by fastforward
And you are using the latest version, right?

Yes, I am using 2.3

Originally posted by tamarian
Fastforward, my members didn't like the usenet messages being included in the "view new posts" search, as we have much slower traffic, so I thought it might be a useful option in the CP to include or exclude usenet messages from the search new posts function. Just an idea. I intend on hacking it in the meantime to exclude usenet messages.
I never really thought about that actually. But you're right, it must be very annoying to see all those messages :) I'll add it ASAP

Gilby, I'm still looking at your problem. In the meantime, have a look in the usenet_article table at the messages that are not being inserted and look at the refs column and the ord column. Then do a search in the post table and see if one of the refs matches an inserted post AND the ord is either equal to or 1 more than an inserted post.
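For anyone doing that check by hand, the condition can be sketched like this in Python. This is only an illustration of the refs/ord test described above, not code from newnews.pl; the function name and the in-memory data layout are assumptions for the example.

```python
def can_insert(article, inserted_posts):
    """Return True if a pending article qualifies for insertion.

    article:        dict with 'refs' (list of msgids) and 'ord' (reference count).
    inserted_posts: dict mapping msgid -> ord for posts already in the post table.
    """
    for ref in article["refs"]:
        if ref in inserted_posts:
            # The article's ord must equal the inserted post's ord, or be one more.
            if article["ord"] - inserted_posts[ref] in (0, 1):
                return True
    return False


inserted = {"<start@example>": 0}                 # thread starter, ord 0
reply_ok = {"refs": ["<start@example>"], "ord": 1}
reply_stuck = {"refs": ["<start@example>"], "ord": 2}
```

Here `can_insert(reply_ok, inserted)` passes, while `reply_stuck` matches on the ref but fails on the ord, so it would stay in usenet_article.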

Also, did the missing messages originate from usenet or your forum?

Well I've just tested it on that webdiscuss news server and I managed to get all the messages from that thread with no problems. They didn't all come in on the first run though. This is because it only does one 'insert into forum' batch each time. So although all messages were retrieved and placed into usenet_article during the first run, it took three more runs to insert them correctly in order. The only other explanation is if the references weren't placed in your outgoing messages correctly. I'll check the code in newreply.php and newthread.php again.

On another note. I have come across a problem with the control panel if gzip is not enabled or output buffering is not on. You'll get the 'Cant add header' errors. I now see why Jelsoft didn't refresh the control panel each time a change is made. I'll look into fixing that tomorrow.

Now I am puzzled. :confused: I did a clean install of the forums (imported the original DB that had no posts, etc. into a new DB) and imported the rec.sport.unicycling newsgroup, and now I only have 15 messages in the usenet_article table. When I look at them, I can see reasons why they were not inserted. Stuff like "Re(2):" or "Re: [Re:" is in the subject, and some of those have no refs (so the orphan replies code doesn't associate a ref with them). Earlier today, I had 52 messages in the usenet_article table and most looked like they shouldn't have had the problems. I also downloaded the latest zip of the hack and replaced the newnews.pl script with the one I had (which was version 2.3, but from before the addslashes() fix was put into the code replacements, which I didn't think affected the newnews.pl file). Another thing I did was set it up for just the one newsgroup without using the multiple servers, since you mentioned the bug with that (although I never saw that particular bug, and you deleted it from your post). So, now it appears to work. Thanks for looking into this problem and sorry about the trouble.

Anyways, I think a nice feature to help with posts not being inserted would be a page in the control panel that lists the posts still in the usenet_article table, with an option to choose a thread to put each one into. This would help with the posts that have weird subject lines, since one unrecognized post makes all the child posts get stuck as well.

More reports:

1. I got one duplicate message inserted (has same poster and title) but has no body:

Here's the empty one:

http://www.lowcarber.org/forum/showthread.php?s=&threadid=1379

Here's the one that's not empty:

http://www.lowcarber.org/forum/showthread.php?s=&threadid=1373

2. CP Stats includes usenet posts, which more than tripled my daily stats. This is just cosmetic suggestion to not include usenet posts in stats or give them their own stats.

3. For some reason the 2.3 newnews.pl does not pull news articles; it just displays connected and then "clean disconnect". When I run the 2.1 version it polls articles properly. I currently have the 2.3 installation of the hack, but my cron job uses the 2.1 version of newnews.pl.

Originally posted by tamarian
1. I got one duplicate message inserted (has same poster and title) but has no body:
This is probably a duplicate post on usenet as well. There is nothing that can be done about that. The script should check to see if the message is completely blank and not insert it if it is.

2. CP Stats includes usenet posts, which more than tripled my daily stats. This is just cosmetic suggestion to not include usenet posts in stats or give them their own stats.

Believe it or not, I've never actually looked at that option in the CP. I think I tried it once in beta 1 and it didn't work so I never looked again :) I'll fix that sometime soon.

3. For some reason the 2.3 newnews.pl does not pull news articles; it just displays connected and then "clean disconnect". When I run the 2.1 version it polls articles properly. I currently have the 2.3 installation of the hack, but my cron job uses the 2.1 version of newnews.pl.
This should not be happening. You may introduce other problems by using an old version of newnews.pl. Even if there are no messages it should tell you. The only reason I can think of is if all newsgroups are turned off in the control panel. Version 2.1 doesn't check for this but 2.3 does. If this is the problem give yourself a slap! :rolleyes:

Originally posted by Gilby
Another thing I did was set it for just the one newsgroup without using the multiple servers since you mentioned the bug with that (although I never saw that particular bug, and you deleted it from your post).
It turned out the problem was actually caused by the server name not being saved. The default one is saved correctly, but any custom, per-newsgroup entries were never inserted into the settings table. This was fixed a few minutes after I posted yesterday. It was just a small edit in the usenet.php file.

Originally posted by fastforward
If this is the problem give yourself a slap! :rolleyes:

Slap! :p

Now I am puzzled. I did a clean install of the forums (imported the original DB that had no posts, etc. into a new DB) and imported the rec.sport.unicycling newsgroup and now I only have 15 messages in the usenet_article table and when I look at it, there are reasons that I see that they were not inserted. Stuff like "Re(2):" or "Re: [Re:" are in the subject and some of those have no refs (so the orphan replies code doesn't associate a ref for it). Earlier today, I had 52 messages in the usenet_article table and most looked like they shouldn't have had the problems.
You might want to keep an eye on this. If there really is a problem, what you are seeing indicates there is something wrong with usenet replies to forum-originated messages. Although I've never experienced any problems, there could be some kind of timing issue that means the references aren't applied correctly.

What happens when a forum message is destined for usenet is this:
1. A dummy msgid is created in the msgid column to avoid primary key violations.
2. The message is sent to usenet at the next run. (This is the last thing it does after pulling articles. At this stage there is still only the dummy key available, which is meaningless.)
3. The next run retrieves the msgid of the posted message and updates the msgid column with the correct message id. (This is the first thing it does, so theoretically, it should have no problem with the messages it's about to pull.)
So the bottom line is... it shouldn't be a problem, but there may be some scenario I've overlooked that could cause the msgid not to be updated.
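The three steps above can be sketched roughly as follows. This is a Python illustration only (the real script is Perl and works against MySQL); the function names, the dummy id format and the dictionary standing in for the table are all assumptions, and how the real script looks up the final message id is not shown here.

```python
import itertools

_counter = itertools.count(1)


def queue_forum_post(rows, body):
    """Step 1: store the post under a placeholder msgid so the key stays unique."""
    dummy = f"<dummy-{next(_counter)}@forum.invalid>"
    rows[dummy] = {"body": body, "sent": False}
    return dummy


def send_outgoing(rows, outbox):
    """Step 2 (end of a run): post unsent messages; the real msgid is not known yet."""
    for row in rows.values():
        if not row["sent"]:
            outbox.append(row["body"])  # stands in for the NNTP post
            row["sent"] = True


def update_msgids(rows, real_ids):
    """Step 3 (start of the next run): swap each dummy key for the real message id."""
    for dummy, real in real_ids.items():
        if dummy in rows:
            rows[real] = rows.pop(dummy)


rows, outbox = {}, []
dummy_id = queue_forum_post(rows, "hello usenet")
send_outgoing(rows, outbox)
update_msgids(rows, {dummy_id: "<123@news.example>"})
```

If step 3 never finds a match for a dummy key, the row keeps its placeholder and usenet replies to that post will have nothing to attach to, which is the scenario being worried about here.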

The obvious cause is if you change your 'useragent' in the control panel. This is how the script identifies your messages. Once you have set it, you should not change it if you have outstanding messages from usenet.

Yesterday when I tried the newsgroup again from scratch, I did not have any better success than before, as I had thought. What happened was that I had the days to keep orphaned replies set to 7 days, so it deleted the ones that were older than 7 days. So today, I tried it with a setting that would be longer than any of the dates of the messages that are in the newsgroup and there were 69 that could not be inserted into the forums (and 468 that were put in the forums).

Anyways, looking at the ones that did not get inserted, I had these cases:

A thread starting post had two direct replies, and both had the refs set to the msgid of the first post in the thread. However, the one that was not inserted had an ord of 2, while the parent post had an ord of 0 and the other reply had an ord of 1. So that didn't fit the ref matching, along with the ord being equal to or one more than the parent post, and it didn't get inserted. So, I changed the MySQL call in the newnews.pl file to ignore the ord. On line 404, I removed "((d.ord = a.ord) OR (d.ord +1 = a.ord))" so that I now have:
my $q3 = db_fetch("SELECT b.title, a.nntpposter, a.forum, a.msgid, a.dtm, a.subject, a.poster, a.body, a.ord, a.postid, a.email, b.threadid, c.ref FROM usenet_article AS a, thread AS b, usenet_ref AS c, post AS d where b.threadid = d.threadid and b.forumid = $$newsgroup->{forumid} and c.ref = d.msgid and a.msgid = c.msgid ORDER BY a.dtm, a.ord");
That meant this post was then inserted into the proper thread along with 4 others that had the same problem, and it brought the total in usenet_article down to 64.

The rest of the cases are now without the ord part of the MySQL query.

A message has two refs in it and both those refs are associated with posts in the post table, however, these referenced posts are in different threads. Also, the original message on the newsgroup does not have any references associated with it. The refs associated with this message both had the same title "(no subject)". So the orphaned replies code must have associated this post with both of these messages.

A reply to a message had a different title than its reference, and the post that was not inserted was a reply to that message with no refs in the headers. The original post was from a newbie to newsgroups who replied to a message in order to start a new thread.

One post was a reply to a post, but the reference for that post did not match the parent post (the message did have quotes from the parent post though).

Some posts have weird subject lines associated with them, such as one person's reply to a message titled "My Post" ending up as "Re: [My Post]", and a reply to another replied message ending up as "Re: [Re: My Post]". These posts do not have any references in them on the newsgroup. Also, there were instances of subjects like "Re(2): My Post" and "Re: Re(2):".

Some posts to the mailing list do not make it to the newsgroup, so the replies to that get trapped in the usenet_article table.

Some posts looked like they matched all the conditions to be inserted, with the refs being associated with the post(s) that it goes to. I went through the MySQL query and everything matched. So I added some console() calls to see if those messages did get selected, and they did. So I added some more console() statements to determine what's happening. Starting at line 440, I put in:
console("\nTrying post from $poster:");
if (db_execute("INSERT INTO post (allowsmilie,threadid,username,dateline,pagetext,visible,ord,msgid,userid,ipaddress,isusenetpost,seq) VALUES ($config{allowsmilies},$threadid,$poster,$dtm,$fbody,'1',$ord,$msgid,$userid,$nntpposter,1,$seq+1)",1)) {
console(" posted!");
$postid = $dbh->{'mysql_insertid'};
db_execute("DELETE FROM usenet_article WHERE msgid = $msgid");
db_execute("DELETE FROM usenet_ref WHERE msgid = $msgid");
my $q4 = db_fetch("SELECT lastpost FROM thread WHERE threadid=$threadid");
my ($lastpost) = $q4->fetchrow_array;
if (!$lastpost) { $lastpost = $dtm; }
db_execute("UPDATE thread SET replycount = $seq ".(($dtm >= $lastpost)?",lastpost=$dtm,lastposter=$poster":"")." WHERE threadid=$threadid");
my $q5 = db_fetch("SELECT lastpost FROM forum WHERE forumid=$forumid");
$lastpost = $q5->fetchrow_array;
if (!$lastpost) { $lastpost = $dtm; }
db_execute("UPDATE forum SET replycount=replycount + 1 ".(($dtm >= $lastpost)?",lastpost=$dtm,lastposter=$poster":"")." WHERE forumid=$forumid");
indexpost($postid);
push(@updated_threads,$threadid);
}
else { console($DBI::errstr); }

This resulted in these results:
Getting article batch from rec.sport.unicycling
No new messages in rec.sport.unicycling
inserting new threads into forums
inserting replies into forums

Trying post from 'Jonathan Marsha':Duplicate entry '<CSujlIAW5Mw6Eww0@jbmarshl.demon.co.uk>' for key 5
Trying post from 'Mark Wiggins':Duplicate entry '<3AC0D05A.620D0A71@ftel.co.uk>' for key 5
Trying post from 'Mark Wiggins':Duplicate entry '<3AC0D05A.620D0A71@ftel.co.uk>' for key 5
Trying post from 'Mark Wiggins':Duplicate entry '<3AC0D05A.620D0A71@ftel.co.uk>' for key 5
Trying post from 'Chuck Webb':Duplicate entry '<As6w6.166$9d.54655@newshog.newsread.com>' for key 5
Trying post from 'Greg House':Duplicate entry '<5Xew6.151$pn3.483290@nntp3.onemain.com>' for key 5
Trying post from 'Greg House':Duplicate entry '<5Xew6.151$pn3.483290@nntp3.onemain.com>' for key 5
Processing outgoing messages
Clean disconnection from news.tc.umn.edu

Um, ok, now I've figured that one out: those posts are already in the forum and somehow ended up in the usenet_article table many times.

So in conclusion, the rec.sport.unicycling newsgroup is pretty messed up as far as the posts being properly threaded. The newsgroup is also an email mailing list, which causes some pretty strange things for references (such as no references) and the formatting of the subject line.
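As an aside on the odd subject lines listed above, here is a hypothetical Python sketch of how such subjects could be reduced to a common base before matching. Nothing like this is claimed to exist in newnews.pl; the function name and the stripping rules are assumptions made for the example.

```python
import re

# Matches a leading "Re:" or "Re(2):" style prefix, in any letter case.
_RE_PREFIX = re.compile(r"^\s*re(\(\d+\))?:\s*", re.IGNORECASE)


def base_subject(subject):
    """Strip reply prefixes and wrapping square brackets until nothing changes."""
    s = subject.strip()
    while True:
        m = _RE_PREFIX.match(s)
        if m:
            s = s[m.end():].strip()
        elif s.startswith("[") and s.endswith("]"):
            s = s[1:-1].strip()
        else:
            return s
```

With this, "Re: [Re: My Post]" and "Re(2): My Post" both reduce to "My Post". One caveat: a subject that is legitimately wrapped in brackets would lose them too, so this is only a starting point.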

Originally posted by fastforward
What happens when a forum message is destined for usenet is this:

That isn't the problem that I have with the one newsgroup, rec.sport.unicycling, as I haven't ever posted a message that went to the newsgroup from the forums. It may have been a factor in the other newsgroups that I played around with but didn't focus too much on trying to get them to work.

1. A thread starting post had two direct replies, and both had the refs set to the msgid of the first post in the thread. However, the one that was not inserted had an ord of 2, while the parent post had an ord of 0 and the other reply had an ord of 1. So that didn't fit the ref matching, along with the ord being equal to or one more than the parent post, and it didn't get inserted. So, I changed the MySQL call in the newnews.pl file to ignore the ord. On line 404, I removed "((d.ord = a.ord) OR (d.ord +1 = a.ord))" so that I now have:

I don't understand how the message with an order of 2 was a direct reply to the parent. If this were the case, it would have an order of 1. This 'ord' field is simply a count of references in the header. The other unlikely scenario is that the news server added two references (but as you pointed out, there was only one ref in each message). I will check the code, but as I'm sure you've seen, not a lot can go wrong with counting how many refs are in the header. Somewhere between the count and the insertion into the usenet_article table it must have lost one of the refs. :confused:
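For reference, counting the references in a header can be sketched like this in Python. It is a guess at the idea rather than the script's own Perl; the function name and the sample ids are invented for the example.

```python
import re

# A message id is taken to be anything between angle brackets with no spaces.
_MSGID = re.compile(r"<[^<>\s]+>")


def count_refs(references_header):
    """Return how many message ids appear in a References header value."""
    return len(_MSGID.findall(references_header or ""))


starter = count_refs("")                                    # thread starter
first_reply = count_refs("<a1@news.example>")
second_reply = count_refs("<a1@news.example> <b2@news.example>")
```

A message with no References header at all counts as 0, which is why a reply posted without refs looks like a thread starter to a simple count.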

If you leave that clause out permanently, you will get messages out of synch. This will happen more on busy groups that are propagated to many servers.

I thought long and hard about the best way to handle it, and this method with the ord seemed to be the one that fit most situations. In addition to the ord column, there is the 'seq' column; this is simply incremented, and the thread is displayed in this order. The 'ord' method makes sure the 'seq' 'makes sense'.

As for the other issues, as you probably realize, there's not much we can do about that I don't think. (Except your suggestion of flagging un-inserted messages for manual insertion). If everybody used a good solid news client we'd be fine :)

I just noticed that on the usenet forum, the "Next thread" link doesn't show up properly. I'd keep pressing it for next thread, and I'd think I'm at the last thread since it only shows "Last thread" But if I check the thread list, there are a few more to read.

This happens on and off, they sometimes show up till the last thread?!?

This must simply be a post count problem. My script does nothing to the actual forum display or the way vB handles it. Try updating the threadcounts. I'll have a look at the code and see if there's a problem with incrementing and indexing of the posts in the script.

I've already noticed a small problem with the counting of usenet user posts if you have enabled the 'import user from usenet' option. That will be fixed in the next release.

Originally posted by tamarian
I just noticed that on the usenet forum, the "Next thread" link doesn't show up properly. I'd keep pressing it for next thread, and I'd think I'm at the last thread since it only shows "Last thread" But if I check the thread list, there are a few more to read.

This happens on and off, they sometimes show up till the last thread?!?
I believe I've found the problem. It was updating the replycount in the forum table, but not the threadcount. In addition, it was updating the replycount twice; once for the thread starter and once again for the associated post. I'll post a new release later today along with a few other minor bug fixes.

Originally posted by fastforward
I don't understand how the message with an order of 2 was a direct reply to the parent. If this were the case, it would have an order of 1. This 'ord' field is simply a count of references in the header. The other unlikely scenario is that the news server added two references (but as you pointed out, there was only one ref in each message). I will check the code, but as I'm sure you've seen, not a lot can go wrong with counting how many refs are in the header. Somewhere between the count and the insertion into the usenet_article table it must have lost one of the refs. :confused:

I looked more closely at the actual messages on the newsgroup server and message 3 was a reply to message 2, however, it had no refs in the header. So an order of 2 is correct, but the orphaned replies code only found one reference to associate with it.

Talking about threads, my usenet threads count is always 0 on main page since i have installed 2.3. Post count is correct but thread count doesn't progress.

Also on another note.. user group permissions to post.
- With usenet forum set to 'Open for new posts?' (No) no one can post, including user groups allowed to do so.
- With usenet forum set to 'Open for new posts?' (Yes)
everyone can post including canpostusenet = 0 groups BUT only canpostusenet = 1 group propagates to usenet. Confusing bit is that some replies get propagated to usenet, and then canpostusenet = 0 replies appear only on the board. In effect to NNTP users my guys appear to be replying to phantom posts in the thread.

I went through the whole lot few times, can't spot the bug...

Originally posted by v0n
Talking about threads, my usenet threads count is always 0 on main page since i have installed 2.3. Post count is correct but thread count doesn't progress.

Check you have entered the variable name correctly in the templates. It should be '$totalusenetthreads'

Also on another note.. user group permissions to post.
- With usenet forum set to 'Open for new posts?' (No) no one can post, including user groups allowed to do so.
- With usenet forum set to 'Open for new posts?' (Yes)
everyone can post including canpostusenet = 0 groups BUT only canpostusenet = 1 group propagates to usenet. Confusing bit is that some replies get propagated to usenet, and then canpostusenet = 0 replies appear only on the board. In effect to NNTP users my guys appear to be replying to phantom posts in the thread.

That's exactly how it's supposed to work. This way there is no interference with the vB forum permissions. All the canpostusenet flag does is give you one more thing to play with when you're setting up forum permissions. It will not stop anyone posting to a forum (the vB permissions are already available for you to do that). As you have correctly determined, all it does is enable/disable propagation to 'usenet'. I will clarify that in the documentation of the next release.

Updated version - New features & bugfixes

This release adds a few options and fixes some of the bugs discovered over the last few days.


Added option in control panel for eliminating usenet users and posts from the stats.
Added option in control panel for eliminating usenet posts from the 'Get New Posts' search.
Added option in control panel to allow multiple iterations of the replies load per batch run, to avoid straggling posts left in the article table when pulling history or large batches.
Removed hardcoded regexes that were removing things like 'fred wrote in message <123213@news.news.com>'. This should now be entered in your 'Replacements' in the 'Spam Control' section of the control panel.
Corrected usenet user post count when importing usenet users
Corrected miscounting of replies in the forum table. (this was confusing vB and causing the next thread link to display when it shouldn't)
Disabled the refresh in the control panel after updates to prevent the 'cant send header' errors when gzip or buffered output is disabled.
Prevented orphan replies with the subject of '(no subject)' being threaded incorrectly alongside non-related articles with no subject. They are now not loaded at all unless there is a reference in the header.


Download version 2.4 for vB 2.0b3 (http://britishexpats.com/download/usenet_gateway.tar.gz)

There are upgrade instructions for anyone who has installed v2.3. There are only two places in which to change code and a few SQL statements to run. I'll be around for a few hours if anyone installs it tonight and runs into problems. But it's straightforward enough... trust me :)

Both the newnews.pl and the usenet.php need replacing with this new version

The latest version of this hack will always be in this first post of the thread.

Fastforward, help!

A user asked me about some crap messages shown in a non-usenet forum: lots of usenet posts that aren't from the group I pull, with things like:

"This message has been automatically sent by Micromuse's Netcool/NNTP
Internet Service Monitor."

and some SPAM etc. dated Feb 12, but added today.

The forum in question is

http://www.lowcarber.org/forum/forumdisplay.php?s=&forumid=2

I'm still using 2.3. Is this a new bug, or a known one fixed in 2.4?

I only use one usenet group as forum #30 and it is populated correctly, while this stuff shows in forum #2.

Don't panic! The only way this can happen is if you have a destination forum for one of your usenet groups set to a real forum.

Just look in your usenet groups option in the control panel and find the offending group. If you're getting automated messages it sounds like it's alt.test or another test group.

If you're absolutely sure it's not showing on your groups options page then look in the usenet_group table for it. But there is no way this can happen by itself. The script has to be told which forum to place messages in.

Hmm,

I've deleted alt.test (was it set by default to forum #2?). I might have run it by mistake before I set up the only usenet group I carry. My usenet server has a huge retention rate, so Feb 12 messages don't surprise me.

Originally posted by tamarian
Hmm,

I've deleted alt.test (was it set by default to forum #2?). I might have run it by mistake before I set up the only usenet group I carry. My usenet server has a huge retention rate, so Feb 12 messages don't surprise me.
Nope. It was set to forumid 9999. This was done to avoid the very problem you've just run into. Maybe I'll just leave the test group out altogether next time.

But if the messages have just appeared today, the script must still be pulling them from usenet. Was it in the usenet_group table?

No, my database shows one group only, alt.support.diet.low-carb, and the same in my CP.

I might've just noticed it today when a user reported it. Since the retention is high on my NNTP server, I pulled Feb 12 messages in late March, so they appear at the bottom and don't get noticed. That's what comes to mind.

Hmmm, sorry about that. I think I will leave the alt.test group out of the next release... just to be safe. But there really is no way for the script to place messages in any forum other than that specified in the usenet_group table.

Can you clean up the forum easily? It's easy to identify a post as usenet, but it's not easy to determine which forum it's in because there's no forumid in the post table. If there's not too many it's probably safer to do it by hand rather than a query.

Let me know if you need any help.

Oh, it could've been a screw-up on my part. Remember my earlier post, when I was using the 2.1 script on top of the 2.3 version of the hack.

Thanks for the offer of help. There are quite a few of them to do by hand.

How can I identify usenet messages? I might be able to do something like

"delete all usenet posts where forum id=2"

If that's what you meant by the difficulty due to not having forum id in the post table, I can try this:

"delete all usenet posts prior to feb 13"?

There is a column in both the post table and the thread table called 'isusenetpost'. The threads are easy to delete as there is also a 'forumid' column there. But there is no way of knowing which usenet articles from the post table are from the forum that needs cleaning as there is no forumid column.
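One possible workaround, sketched here as an assumption rather than a tested procedure for vB: the post rows do carry a threadid, and the thread rows carry a forumid, so the thread ids for the affected forum can be collected first and the posts deleted by threadid. The Python below uses an in-memory SQLite database with a cut-down, made-up schema purely to show the two-step idea; the function name is hypothetical, and it is done in two steps because the database in use may not support subqueries.

```python
import sqlite3


def purge_usenet_posts(conn, forumid):
    """Delete usenet posts and threads from one forum; return posts removed."""
    cur = conn.execute(
        "SELECT threadid FROM thread WHERE forumid = ? AND isusenetpost = 1",
        (forumid,),
    )
    thread_ids = [row[0] for row in cur.fetchall()]
    if not thread_ids:
        return 0
    marks = ",".join("?" * len(thread_ids))  # one placeholder per thread id
    removed = conn.execute(
        f"DELETE FROM post WHERE isusenetpost = 1 AND threadid IN ({marks})",
        thread_ids,
    ).rowcount
    conn.execute(f"DELETE FROM thread WHERE threadid IN ({marks})", thread_ids)
    return removed


# Toy data: forum 2 is the polluted real forum, forum 30 is the usenet forum.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE thread (threadid INTEGER PRIMARY KEY, forumid INTEGER, isusenetpost INTEGER);
    CREATE TABLE post (postid INTEGER PRIMARY KEY, threadid INTEGER, isusenetpost INTEGER);
    INSERT INTO thread VALUES (1, 2, 1), (2, 2, 0), (3, 30, 1);
    INSERT INTO post VALUES (10, 1, 1), (11, 1, 1), (12, 2, 0), (13, 3, 1);
""")
removed = purge_usenet_posts(conn, 2)
```

Back up the database before trying anything like this on a live board. If members have replied inside one of the stray usenet threads, those replies would be left without a thread, so check for that first.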

The cleanest way may be to purge all usenet posts and re-pull the news. You can opt to keep articles that are usenet, but were posted by your users in the control panel if that is a requirement.

I have no problem purging all usenet posts and polling them back again. How can I purge them? Just to make sure I got it right:

Delete all threads where isusenetpost=1
Delete all posts where isusenetpost=1

Any other queries I need to do?

You can do it from the control panel in the 'Groups Section'

There is an option at the very top to purge all posts, or you can do it for each group.

But if you want to do it by hand you are correct in what you said. 'where isusenetpost=1'

Oops, that didn't work right. It just deleted the posts from the usenet forum, not from the screwed up one! I'll go table by table.

Ah.. yeah.. sorry. I should have realized that. It's another safety feature to only delete from forums that are specified in the usenet_group table. As you've deleted it from that table it ignores it.

Fastforward, I think I found the explanation to what happened. I grep'd on test and found this:

vB_sql_changes.sql:INSERT INTO usenet_group (newsgroup, forum, lastmsg, server,
username, password, enabled, footer) VALUES ( 'alt.test', '2', '0', '', '', '',
'1', '');

I'm not sure if this was 2.1 or 2.3, but one of them probably wasn't 9999. Just in case anyone has the same installation.

Those funny threads were probably sucked in before I deleted alt.test.

Everything's fine now, and I'll install the 2.4

The link for downloading 2.4 doesn't work?

Originally posted by tamarian
The link for downloading 2.4 doesn't work?

Doesn't work for me, but it did earlier. Here's a link that works: http://britishexpats.com/download/usenet_gateway_v2.4__20b3.tar.gz

Originally posted by tamarian
Fastforward, I think I found the explanation to what happened. I grep'd on test and found this:

vB_sql_changes.sql:INSERT INTO usenet_group (newsgroup, forum, lastmsg, server,
username, password, enabled, footer) VALUES ( 'alt.test', '2', '0', '', '', '',
'1', '');

I'm not sure if this was 2.1 or 2.3, but one of them probably wasn't 9999. Just in case anyone has the same installation.

Those funny threads were probably sucked in before I deleted alt.test.

Well, I don't know what to say, except I'm very sorry. I'm guessing I overwrote my original statement that had 9999 when I gave the docs an overhaul. To ensure everything worked I extracted the DDL & DML from my setup using phpMyAdmin. My setup was just for test and I did have the alt.test group pointing to forum #2.

I have now removed the alt.test group completely. I'll also add a note in the docs to double check the groups options page and make sure the forums specified are really where you want the usenet messages to go before running the script for the first time.

The download link was a good one, but I woke up this morning to find my disk was full. Apparently I no longer have access to /dev/null so all my log files filled up. I guess I'll be on the phone all day to Host Pro :rolleyes:

Has anybody installed 2.4 yet? Is it all working OK?

Originally posted by fastforward
Well, I don't know what to say, except I'm very sorry. I'm guessing I overwrote my original statement that had 9999 when I gave the docs an overhaul. To ensure everything worked I extracted the DDL & DML from my setup using phpMyAdmin. My setup was just for test and I did have the alt.test group pointing to forum #2.

No problem Fastforward, was just a minor inconvenience, nothing compared to the work you've done on this hack, which is the best vb hack in my book:)

With 2.4 i'm actually getting
Parse error: parse error in /usr/local/apache/htdocs/forum/admin/functions.php on line 871

Fatal error: Call to undefined function: vbdate() in /usr/local/apache/htdocs/forum/admin/sessions.php on line 326

in both admin/ and forum

It's the
$DB_site->query("UPDATE forum SET lastactivethread = LEFT('"..addslashes(htmlspecialchars($threadinfo[title]))."',$lastactivethread_length) WHERE forumid = ".$threadinfo[forumid]);
line...

I wrote a script that lets you insert an archived newsgroup or mailing list into the usenet tables. I wrote it so that I could import all the messages since the newsgroup was created and so that I could feed the new messages in from two sources, the mailing list and the newsgroup. Feeding it in from the two sources will, I hope, eliminate my missing posts problem for the ones that don't make it into the newsgroup.

This script will read mail in from a file or from an email alias and then parse it to be included in the database. Then the messages will be inserted with the newnews.pl script provided in the Usenet Gateway hack. This script does not make use of any of the spam/binary filtering or other replacement strings that are applied before messages are inserted.

How to use this script:

From the command line, execute the script like:
./mail2forum.pl < filename2messages.txt
That will insert all the messages in the text file into the database. The text file needs to conform to the mail standards, so it's the same format as the sendmail spool files and the same as what many other email programs use. (Basically, the beginning of a message is signified by a line beginning with "From " followed by the headers, then an empty line, and then the body).
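To make the spool layout described above concrete, here is a minimal sketch in Python (not the Perl used by mail2forum.pl). The function name `split_mbox` is made up for this illustration, and it assumes that body lines never start with "From " (real spool files usually escape those as ">From ").

```python
def split_mbox(text):
    """Split mbox-style text into a list of (headers, body) pairs.

    Hypothetical helper for illustration only. Each message is assumed
    to start with a line beginning with "From ", followed by headers,
    an empty line, and then the body.
    """
    raw_messages = []
    current = None
    for line in text.splitlines():
        if line.startswith("From "):
            # A new "From " separator line closes the previous message.
            if current is not None:
                raw_messages.append(current)
            current = []
        elif current is not None:
            current.append(line)
    if current is not None:
        raw_messages.append(current)

    parsed = []
    for lines in raw_messages:
        # Headers run up to the first empty line; the rest is the body.
        if "" in lines:
            cut = lines.index("")
            header_lines, body_lines = lines[:cut], lines[cut + 1:]
        else:
            header_lines, body_lines = lines, []
        headers = {}
        last_name = None
        for header in header_lines:
            if header[:1] in (" ", "\t") and last_name:
                # Folded header: continuation of the previous header line.
                headers[last_name] += " " + header.strip()
                continue
            name, sep, value = header.partition(":")
            if sep:
                last_name = name.strip().lower()
                headers[last_name] = value.strip()
        parsed.append((headers, "\n".join(body_lines)))
    return parsed
```

Header names are lower-cased here so that later lookups do not depend on how a particular mail program capitalised them.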

This script can also be used to insert a message sent to an email address right into the database. To do this, you'll need to create an alias in the sendmail aliases (usually at /etc/aliases) to execute the script. The alias would look like this:
forum: "|/path/to/mail2forum.pl"
Most systems are set up so that not just any script can be executed by sendmail for security reasons. Usually, you'll need to create a symbolic link to the file in a special directory (as root). In the systems I've used, this directory has been "/etc/smrsh/" or "/usr/adm/sm.bin/".

You'll need to rename the file to have the .pl extension.

Replace that line with the following:

$DB_site->query("UPDATE forum SET lastactivethread = LEFT('".addslashes(htmlspecialchars($threadinfo[title]))."',$lastactivethread_length) WHERE forumid = ".$threadinfo[forumid]);

It's the extra dot in front of the addslashes that's causing the problem.

Sorry about that. I've updated the docs.

It won't affect anyone who did an upgrade. The error wasn't there in the previous versions. Only those who have done a fresh install of 2.4 will see this error.

Archive/email to Forum!
I wrote a script that lets you insert an archived newsgroup or mailing list into the usenet tables. I wrote it so that I could import all the messages since the newsgroup was created and so that I could feed the new messages in from two sources, the mailing list and the newsgroup. Feeding it in from the two sources will, I hope, eliminate my missing posts problem for the ones that don't make it into the newsgroup.

Thanks for this Gilby. I can actually make use of this myself. I wanted to include the php.net mailing list archive into my forums. It doesn't comply with the usual usenet style mail archives so this is ideal!

In my outgoing usenet messages I use the email address at forums@britishexpats.com. This means I get a whole bunch of replies to forum users that have posted questions sent to my default email. The header actually looks like this:

fastforward <forums@britishexpats.com>

So the user is easily identifiable. Does anyone think it's worth adding something that will collect all these messages, run them through the spam filter and forward them on to the member? Or is it likely to cause members to become irate if a bit of spam leaks through?

how about forwarding them to the user's PM box with the following bit added at the top:

"This was sent to you as a reply to a newsgroup post.. we cannot accept responsibility etc..."

Chris... I thought you were dead! Where have you been? On holiday?

I never thought of using the PM box. That's not a bad idea.

Originally posted by fastforward
Does anyone think it's worth adding something that will collect all these messages, run them through the spam filter and forward them on to the member? Or is it likely to cause members to become irate if a bit of spam leaks through?

I have the email address set up to be an autoresponder to let the person know that that message was not received. Since the newsgroup that I am doing it on is also an email list, a lot of people just hit the reply all button to respond, so those really don't need to be passed on to the person it's being responded to. The autoresponder also puts in a little plug for my site :D and tells them to go to the site to send a PM.

Anyways, I think the best option (probably more complicated to do too :() would be to have an option in the profile on whether they want to have the email sent to them to be forwarded to an address they specify, put in their PMs, or neither.

Originally posted by fastforward

Thanks for this Gilby. I can actually make use of this myself. I wanted to include the php.net mailing list archive into my forums. It doesn't comply with the usual usenet style mail archives so this is ideal!

I'm glad it's useful for you! If you want to add it to the distribution of the usenet hack (and possibly modify it further), feel free to do so.

I'm kind of curious....what kind of a load does this impose on a server?

It really depends on how many groups you are pulling and how busy they are. On one site I have about 12 groups. When that script runs you don't even notice it. It may as well not be there. On my other site I am pulling in 70 groups. With this one the load gets to 1.0 - 1.5 as it gets near the end of the run. Most of the work is spent building that damn searchindex table! I wish it would just go away! :) Without indexing the posts the load would not be worth talking about.

I created this Manual Insertion hack that integrates into the vB admin control panel. This hack lets you select messages that are in the queue to be inserted but don't have references matching any of the existing messages and are therefore stuck in the usenet_article table. Once you have selected the messages, you can do the following:
Delete
Start a new thread
Merge selected messages together into one thread
Insert post(s) into existing thread
All actions, except delete, take effect once the newnews.pl script is executed.

The installation is pretty simple: there is one SQL query that needs to be run and then two additions to the usenet.php file. If you have any questions or find a bug, please feel free to ask.

fastforward, feel free to include this in the Usenet Gateway hack distribution or do anything else you want to with it.

I updated the mail2forum.pl script, the current one can be downloaded here (http://www.vbulletin.com/forum/attachment.php?s=&attachmentid=416) or above.

Improvements:
Better at finding references in messages from mail programs that don't use the normal "References:" header and checks the "In-reply-to:" header which was commonly used many years ago.
Improved getting all the references in the usenet_ref table, which it didn't do before.
Checks date and doesn't allow a date to be in the future.
Looks for "Posted-Date:" header as a higher priority than "Date:" as some posts didn't have a year in the "Date:" header (and did in the Posted-Date: header).
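The header handling listed above can be sketched roughly as follows. This is a Python illustration, not the Perl in mail2forum.pl; the function names and the `headers` dict keyed by lower-case header names are assumptions made for the example.

```python
import re
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def collect_references(headers):
    """Return message-ids from "references", falling back to "in-reply-to"."""
    raw = headers.get("references") or headers.get("in-reply-to") or ""
    # Message-ids are the <...> tokens; anything else in the header is ignored.
    return re.findall(r"<[^<>\s]+>", raw)


def pick_post_date(headers, now=None):
    """Pick a posting time, preferring "posted-date" over "date".

    The result is never later than `now`, so a message with a bad clock
    cannot be dated in the future. Unparseable headers are skipped.
    """
    now = now or datetime.now(timezone.utc)
    for name in ("posted-date", "date"):
        raw = headers.get(name)
        if not raw:
            continue
        try:
            parsed = parsedate_to_datetime(raw)
        except (TypeError, ValueError):
            continue
        if parsed.tzinfo is None:
            # Treat dates without a usable zone as UTC (an assumption).
            parsed = parsed.replace(tzinfo=timezone.utc)
        return min(parsed, now)
    # No usable date header at all: fall back to the current time.
    return now
```

Falling back to the current time when neither header parses is a choice made for this sketch; the script itself may handle that case differently.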

This fix entails two more code changes, both in 'postings.php'.

It corrects a duplicate bug when trying to move a thread with a redirect, or copying a thread.

No changes are required in any of the other usenet files or tables.

The changes are the last two (20 & 21) in vB_code_changes.txt.

Here are the changes to save downloading again:

20) postings.php - line 333

OLD CODE
--------
$DB_site->query("INSERT INTO thread (threadid,title,lastpost,forumid,pollid,open,replycount,postusername,postuserid, lastposter,dateline,views,iconid,visible) VALUES (NULL,'".addslashes($threadinfo[title])."','".addslashes($threadinfo[lastpost])."','".addslashes($threadinfo[forumid])."','".addslashes($threadinfo[threadid])."',10,'".addslashes($threadinfo[replycount])."','".addslashes($threadinfo[postusername])."','".addslashes($threadinfo[postuserid])."','".addslashes($threadinfo[lastposter])."','".addslashes($threadinfo[dateline])."','".addslashes($threadinfo[views])."','".addslashes($threadinfo[iconid])."','".addslashes($threadinfo[visible])."')");


NEW CODE
--------
// START USENET HACK
$DB_site->query("INSERT INTO thread (threadid,title,lastpost,forumid,pollid,open,replycount,postusername,postuserid, lastposter,dateline,views,iconid,visible,isusenetpost,regpost,msgid) VALUES (NULL,'".addslashes($threadinfo[title])."','".addslashes($threadinfo[lastpost])."','".addslashes($threadinfo[forumid])."','".addslashes($threadinfo[threadid])."',10,'".addslashes($threadinfo[replycount])."','".addslashes($threadinfo[postusername])."','".addslashes($threadinfo[postuserid])."','".addslashes($threadinfo[lastposter])."','".addslashes($threadinfo[dateline])."','".addslashes($threadinfo[views])."','".addslashes($threadinfo[iconid])."','".addslashes($threadinfo[visible])."','$threadinfo[isusenetpost]','$threadinfo[regpost]','$threadinfo[threadid]')");
// END USENET HACK


21) postings.php - line 341

OLD CODE
--------
$DB_site->query("INSERT INTO thread (threadid,title,lastpost,forumid,pollid,open,replycount,postusername,postuserid, lastposter,dateline,views,iconid,notes,visible) VALUES (NULL,'".addslashes($threadinfo[title])."','".addslashes($threadinfo[lastpost])."','".addslashes($threadinfo[forumid])."','".addslashes($threadinfo[pollid])."','".addslashes($threadinfo[open])."','".addslashes($threadinfo[replycount])."','".addslashes($threadinfo[postusername])."','".addslashes($threadinfo[postuserid])."','".addslashes($threadinfo[lastposter])."','".addslashes($threadinfo[dateline])."','".addslashes($threadinfo[views])."','".addslashes($threadinfo[iconid])."','".addslashes($threadinfo[notes])."','".addslashes($threadinfo[visible])."')");
$newthreadid=$DB_site->insert_id();

$DB_site->query("UPDATE thread SET notes='".addslashes($threadinfo[notes])."',forumid='".addslashes($forumid)."' WHERE threadid='$threadid'");

$posts=$DB_site->query("SELECT * FROM post WHERE threadid='$threadid'");
while ($post=$DB_site->fetch_array($posts)) {
$DB_site->query("INSERT INTO post (postid,threadid,username,userid,title,dateline,attachmentid,pagetext,allowsmilie,showsignature,ipaddress,iconid,visible,edituserid,editdate) VALUES (NULL,'$newthreadid','".addslashes($post[username])."','".addslashes($post[userid])."','".addslashes($post[title])."','".addslashes($post[dateline])."','".addslashes($post[attachmentid])."','".addslashes($post[pagetext])."','".addslashes($post[allowsmilie])."','".addslashes($post[showsignature])."','".addslashes($post[ipaddress])."','".addslashes($post[iconid])."','".addslashes($post[visible])."','".addslashes($post[edituserid])."','".addslashes($post[editdate])."')");
}


NEW CODE
--------

// START USENET HACK
$DB_site->query("INSERT INTO thread (threadid,title,lastpost,forumid,pollid,open,replycount,postusername,postuserid, lastposter,dateline,views,iconid,notes,visible,isusenetpost,regpost,msgid) VALUES (NULL,'".addslashes($threadinfo[title])."','".addslashes($threadinfo[lastpost])."','".addslashes($threadinfo[forumid])."','".addslashes($threadinfo[pollid])."','".addslashes($threadinfo[open])."','".addslashes($threadinfo[replycount])."','".addslashes($threadinfo[postusername])."','".addslashes($threadinfo[postuserid])."','".addslashes($threadinfo[lastposter])."','".addslashes($threadinfo[dateline])."','".addslashes($threadinfo[views])."','".addslashes($threadinfo[iconid])."','".addslashes($threadinfo[notes])."','".addslashes($threadinfo[visible])."','$threadinfo[isusenetpost]','$threadinfo[regpost]','".$threadinfo[threadid] . addslashes($threadinfo[msgid])."')");
$newthreadid=$DB_site->insert_id();
$DB_site->query("UPDATE thread SET notes='".addslashes($threadinfo[notes])."',forumid='".addslashes($forumid)."' WHERE threadid='$threadid'");
$posts=$DB_site->query("SELECT * FROM post WHERE threadid='$threadid'");
while ($post=$DB_site->fetch_array($posts)) {
$DB_site->query("INSERT INTO post (postid,threadid,username,userid,title,dateline,attachmentid,pagetext,allowsmilie,showsignature,ipaddress,iconid,visible,edituserid,editdate,msgid) VALUES (NULL,'$newthreadid','".addslashes($post[username])."','".addslashes($post[userid])."','".addslashes($post[title])."','".addslashes($post[dateline])."','".addslashes($post[attachmentid])."','".addslashes($post[pagetext])."','".addslashes($post[allowsmilie])."','".addslashes($post[showsignature])."','".addslashes($post[ipaddress])."','".addslashes($post[iconid])."','".addslashes($post[visible])."','".addslashes($post[edituserid])."','".addslashes($post[editdate])."','$post[postid]')");
}
// END USENET HACK

To allow upgrade to beta 4 before I get the new release out you must do the following:

Delete the unique indexes on the msgid column in the post table and the thread table.

The new usenet gateway release will not require these indexes. However, to avoid any chance of duplicate usenet messages in the meantime, make sure you don't alter the message numbers in the control panel.

After upgrading to beta 4, you should be able to continue collecting news, but you won't be able to post outgoing messages.

I'll get the new release out in a day or two.

This is a drop in replacement for newnews.pl. It will allow you to continue collecting news without fear of duplicate posts.

Before upgrading to vB beta 4 you should drop the unique indexes on the msgid column in the post and thread tables. Then create the new usenet_loader table. I have included an SQL script that does this for you.

This new 'newnews.pl' script eliminates the need for them.

Fastforward, thanks for the beta4 preparations!

I almost forgot. You should also run these two sql statements to create non-unique indexes for the ones you just deleted. These aren't essential, but the columns are used in joins during a news load. You will eliminate full table scans and reduce server load during news pulls if you create them.


ALTER TABLE post ADD INDEX(msgid);
ALTER TABLE thread ADD INDEX(msgid);

This release eliminates the need for unique indexes on the msgid columns in the post and thread tables. This will mean there should be no problems when upgrading to future versions of vB (Other than the inability to post news obviously).

All the code changes have been revisited and changed where necessary for beta 4.1. The beta 4.1 version this was tested with is the one dated 4/4/01 at 4:49pm. There have been at least two builds with different timestamps since 4.1 was released.

The link in the first post of this has been updated to point to the latest version (http://britishexpats.com/download/usenet_gateway.tar.gz)

Let me know if you have any problems.

I think if you release a new version (e.g. 4.0 after 3.x) then you should open a new thread, because for me it was confusing to read all the problems from Version 1.x to now.

I read through this whole thing, but there is one point I missed - please explain to a total usegroup newbie: Do I have to pay for usegroup access? (somebody mentioned something like that)
If so, which services are out there?

Thanks

Originally posted by robertusss
I read through this whole thing, but there is one point I missed - please explain to a total usegroup newbie: Do I have to pay for usegroup access? (somebody mentioned something like that)
If so, which services are out there?

It depends on what your webhosting provider offers. If they are dial up users, then you will probably be able to use their news server at no charge. You may be able to get through from your ISP as well by using your username and password to access that news server. If you don't have one of those, then you'll have to find one of the services that you'd need to pay to use... though, I think there are a few free ones out there.

I found a bug that happens when a post is made on the forums and sent to the newsgroup. Then, when the post is fetched, the postid and threadid that are sent in the headers of the message are read in, but get associated with the post that has the message number one more than the forum post. This results in the post in the forum getting the wrong message-id assigned to it, and then the post right after the forum post can't be inserted because the message-id already exists.

If the "Re-import Local Posts" is set to yes, then the forum post gets duplicated in the forums, otherwise it just gets deleted from usenet_article (or never inserted to begin with).

This occurred in version 2.4 for vB 2.0 beta 3 and I upgraded everything to the new versions (2.5 and vB 2.0 beta 4.1) without any other modifications to the scripts.

I tried to look at the newnews.pl script but I was unable to see where there was anything wrong (though I don't follow that programming style all that well :rolleyes: ). I also noticed that when the newnews.pl file was run, and there were 2 new posts on the server, it would get 3 posts. This would be the case any time there were new posts where it would get one more than what was new, and if there were no new posts, it would fetch no posts. I thought this might have some relevance.

I think I saw that when using the 'post' in the NNTP module, it returns the message-id, so that may be a different route to use for getting the message-id.

Originally posted by Gilby
I found a bug that happens when a post is made on the forums and sent to the newsgroup. Then, when the post is fetched, the postid and threadid that are sent in the headers of the message are read in, but get associated with the post that has the message number one more than the forum post. This results in the post in the forum getting the wrong message-id assigned to it, and then the post right after the forum post can't be inserted because the message-id already exists.

I saw that before. I thought I'd fixed it... but obviously not.

If the "Re-import Local Posts" is set to yes, then the forum post gets duplicated in the forums

I don't see how that can be possible. It can't insert an article if the msgid already exists. It's obviously a direct result of the bug above. (Or is that your point?)

I tried to look at the newnews.pl script but I was unable to see where there was anything wrong (though I don't follow that programming style all that well :rolleyes: ).

What are you trying to say? What's wrong with the programming style? eh?! ;)

I also noticed that when the newnews.pl file was run, and there were 2 new posts on the server, it would get 3 posts. This would be the case any time there were new posts where it would get one more than what was new, and if there were no new posts, it would fetch no posts. I thought this might have some relevance.

This is probably just a problem with the 'for loop'. It derives its count from the last message number in the forum, the batch limit in the control panel and the last number reported by the newsserver.
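For anyone following along, the three inputs mentioned above combine roughly like this. It is a Python sketch with made-up names, not the actual loop in newnews.pl; it only shows where an inclusive bound can turn "2 new posts" into 3 requests.

```python
def next_batch(last_loaded, server_last, batch_limit):
    """Article numbers to request on this run.

    last_loaded : highest article number already in the forum
    server_last : highest article number reported by the news server
    batch_limit : maximum number of articles to fetch per run

    All three parameter names are hypothetical.
    """
    first = last_loaded + 1  # starting at last_loaded would re-fetch one article
    last = min(server_last, last_loaded + batch_limit)
    # range() excludes its end value, so add 1 to make `last` inclusive.
    return range(first, last + 1)


# Two new articles on the server means exactly two requests.
print(list(next_batch(100, 102, 50)))
```

If the loop started at `last_loaded` instead of `last_loaded + 1`, every run with new articles would request one extra, and a run with no new articles would still request none only if that case is checked separately.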

I think I saw that when using the 'post' in the NNTP module, it returns the message-id, so that may be a different route to use for getting the message-id.

This is the way I expected it to work. That doesn't seem to be the case though. I couldn't get it to return anything. It may be a newsserver issue but even CPAN makes no mention of a return value. If you can confirm it returns the msgid, let me know. It would make things a lot easier and more reliable.

By the way; I've got another version of the script ready. This one has a few efficiency changes but the big change is that it now continues to load replies in the correct order until there are no more left. This avoids having to make multiple runs when collecting history and should also prevent the stray articles that may not get loaded. Here's the zip if you want to play with it. I've got it running on my server. I'll package it up properly after I've looked at these other bugs and beta 5 comes out.

The one big problem with the script at the moment is the orphan reply checking. It does a full table scan for each newsgroup every run. That means for my 70 newsgroups I'm doing 70 full tablescans! My poor little Freedom 300 can barely keep up. As the post table fills up, it gets worse (I've got 23,000 posts at the moment). I need to find another way to do it or maybe make it an option for individual newsgroups. Although you get a few orphans in every newsgroup, it's really only a big problem with the mailing archives. Any ideas?

Hi fastforward,

Can I replace the old ./newnews.pl with this new one and if so are there any changes I need to make besides just replacing the file?

Thanks,

Michael

Does spam filter work during posting? I'm a little afraid that opening posting features could attract spammers and get my news server blacklisted?

Originally posted by fastforward
I don't see how that can be possible. It can't insert an article if the msgid already exists. It's obviously a direct result of the bug above. (Or is that your point?)


That's the point... one of the results of this bug.

What are you trying to say? What's wrong with the programming style? eh?! ;)

I wouldn't say there is anything wrong.... just not a style that I can follow very well (with those variable variables and such). I am not all that good at Perl.


This is the way I expected it to work. That doesn't seem to be the case though. I couldn't get it to return anything. It may be a newsserver issue but even CPAN makes no mention of a return value. If you can confirm it returns the msgid, let me know. It would make things a lot easier and more reliable.


I saw it on a page from a google search. I don't know which module it was (I've seen both News::NNTPClient and Net::NNTP which appear to have the same functions). On CPAN, neither of these modules mentions anything about returning the message-id. :(

Although you get a few orphans in every newsgroup, it's really only a big problem with the mailing archives. Any ideas?

I did a lot of manual insertion to minimize how many articles were in the usenet_article table, then when I gave up for the time being on inserting those posts, I renamed the table and created a blank one. Though that's just a temporary fix. Another idea is, when not importing old archives, to limit the subjects it can match to those threads that have posts from the last 30 days in them. Maybe even make another table with just the subjects and refs of the threads from the last 30 days.
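The 30-day idea above could look something like the following. This is a Python sketch rather than a change to newnews.pl. The dictionary keys `threadid`, `title` and `lastpost` mirror column names from the thread table, but using `datetime` values for `lastpost` and stripping "Re:" prefixes before comparing are assumptions of the example.

```python
import re
from datetime import datetime, timedelta


def normalize_subject(subject):
    """Strip leading "Re:" prefixes and case so replies match their thread."""
    return re.sub(r"^(\s*re:\s*)+", "", subject, flags=re.IGNORECASE).strip().lower()


def build_recent_index(threads, now, max_age_days=30):
    """Map normalized titles to thread ids for recently active threads only."""
    cutoff = now - timedelta(days=max_age_days)
    index = {}
    for thread in threads:
        if thread["lastpost"] >= cutoff:
            index[normalize_subject(thread["title"])] = thread["threadid"]
    return index


def match_orphan(subject, index):
    """Return the thread id an orphan reply belongs to, or None."""
    return index.get(normalize_subject(subject))
```

Building the small index once per run and doing dictionary lookups against it is the in-memory equivalent of the extra "recent subjects" table suggested above: each orphan is checked against a short list instead of the whole thread table.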

Originally posted by v0n
Does spam filter work during posting? I'm a little afraid that opening posting features could attract spammers and get my news server blacklisted?

No, it doesn't. I was thinking of adding it though. It's only a minor change so I'll add it in the next release with beta 5.

Originally posted by mkilty
Can I replace the old ./newnews.pl with this new one and if so are there any changes I need to make besides just replacing the file?

You should be able to do that, provided you deleted the indexes and added the extra table from the latest full release. If not, it won't work. Let me know if you need help.

Originally posted by fastforward
This fix entails two more code changes, both in 'postings.php'.


Fastforward, your 2.5 vb code change instructions don't have any of these postings.php changes. Are they still needed?

Originally posted by tamarian
Fastforward, your 2.5 vb code change instructions don't have any of these postings.php changes. Are they still needed?
No... not needed because the unique constraint has been removed from the msgid columns in the post and thread tables.

Make sure you deleted the unique indexes and re-created normal indexes on those columns.

This release is for vB 2.0 beta 5

Upgrading:

The only vB code changes that differ from the previous version are in showthread.php. The way the Registered: and Location: bits are handled (via custom fields) makes it messy to try to hide them via code. If you removed the words from your templates in a previous version, you'll have to put them back again as this release doesn't attempt to hide them. If you really need them hidden, the best bet is to create a separate style set for usenet forums. I'll look at adding options to handle it via the control panel in a later release.
No DDL changes have been made.
There is one DML change that deletes an unnecessary option. (The 'reply iterations' option that is now handled automatically).

This is all listed in UPGRADE.txt

INSTALL.txt details a fresh installation on a virgin vB 2.0 beta 5.

Download (http://britishexpats.com/download/usenet_gateway.tar.gz)

All links in this thread have been updated to point to the new version. The latest version will always be in the first post of this thread.

I hope to get another release of newnews.pl out over the weekend that will add some of the outstanding features listed at the top of this thread. This should simply be a drop in replacement.

Originally posted by Gilby
It depends on what your webhosting provider offers. If they are dial up users, then you will probably be able to use their news server at no charge. You may be able to get through from your ISP as well by using your username and password to access that news server. If you don't have one of those, then you'll have to find one of the services that you'd need to pay to use... though, I think there are a few free ones out there.

does anybody know some of the free ones?

You're so fast forward :) I only knew about beta5 from your post!

wow it amazes me how far this hack has come - it is a program in its own right now! Brilliant.

I can't wait to install, going to wait until stable vbeta comes out - the upgrading is killing me, i am only on beta3 i think the thought of all the template changes gives me sleepless nights :)

Cheers so much for the hack - so cool!

I totally agree!

I'm dead impressed with this script - it has added a great deal to the forums on my site.

Keep up the good work...just waiting for VB 2 to come out of beta so I can stop upgrading this damn thing!!

I'm trying to learn more about this hack before implementing it on my own Forums.

My understanding is that any post that is made in the newsgroup that you have "registered" to will show up as a post in that particular Forum in your Forums? Those posts are by "unregistered" users who posted through the newsgroup?

Also, usenet newsgroups receive a ton of spam/junkmail from people advertising everything from get rich quick schemes to porn sites. Is there a way to keep these kinds of posts out of your forum/usenet gateway???

There is a fantastic, fully customisable spam filter built into the hack to block spam coming in from the newsgroups.

I've run into a problem...

I've just upgraded my VB to 2.0b5 and at the same time upgraded to Usenet Gateway 2.6 for b5...

Now all seemed to go well until I started pulling new posts again...i've got an extract below, where the newnews.pl just hangs after checking for orphans...the weird thing is that top still reports that mysqld is running and pulling quite a load on the server, so something is happening!...HELP

Connecting to goliath.newsfeeds.com... Connected
Sending authentication info... Authenticated and logged in
Getting article batch from rec.scuba.marketplace
Fetching headers of articles 138 to 153...
Requesting article 139...
Requesting article 140...
Requesting article 141...
Requesting article 142...
Requesting article 143...
Requesting article 144...
Requesting article 145...
Requesting article 146...
Requesting article 147...
Requesting article 148...
Requesting article 149...
Requesting article 150...
Requesting article 151...
Requesting article 152...
Requesting article 153...
Fetching article body 139... OK
Fetching article body 140... OK
Fetching article body 141... OK
Fetching article body 142... OK
Fetching article body 143... OK
Fetching article body 144... OK
Fetching article body 145... OK
Fetching article body 146... OK
Fetching article body 147... OK
Fetching article body 148... OK
Fetching article body 149... OK
Fetching article body 150... OK
Fetching article body 151... OK
Fetching article body 152... OK
Fetching article body 153... OK
Processing article batch...
Requested 15 messages... 2 not available or rejected.

inserting new threads into forums
finding replies...
checking for orphans...

What version of the usenet hack were you running before? Did your previous version have the orphan checking routine?

If it looks like MySQL is doing something, it most probably is. The orphan checking routine is quite intensive if there are a lot of articles in the usenet_article table.

Assuming you didn't have the orphan reply check in your previous version:
1) If there are a lot of records in the usenet_article table (over 1,000), you can try putting an index on the 'title' column in the thread table temporarily. You should also investigate why there are so many records if this is the case. Make sure your 'expire' setting is set to something sensible in the control panel. It should be 2 to 5 days.

If your previous version did have the orphan checking routine:
The change in this version is the use of LOW_PRIORITY on the insert statement of that routine. It will take longer than usual, but I must admit, on my shared virtual server I did not notice any difference.

Finally, you may want to experiment by removing the STRAIGHT_JOIN hint from line 407 of newnews.pl. This hint forces MySQL to join the tables in the same order as the select clause. It works faster on my setup with that hint, but you never know, try removing it and see what happens.

Other than that, I don't know what to suggest. You're not getting an error message and that particular routine is doing nothing fancy. Just a bunch of queries on the title field of the thread table.

Let me know how you get on.

Stephan

Did the script ever finish? Did you kill it? Is it working now? Did your server meltdown? :)

Originally posted by robertusss


does anybody know some of the free ones?

A simple search on google resulted in finding this list: http://freenews.maxbaud.net/

They have a lot listed there, the free servers are probably not going to be very reliable though. You should try to get one close to where your server is located as it will be faster. There are lots of other lists out there too.

Fastforward,

Sorry for the delay in replying, I (quite stupidly) did the upgrade quite late in the evening and when I encountered the problems just killed the whole process and re-started it again this morning...

Anyway...

I was running 2.4 for B3, but I downloaded the 2.5 and 2.6 versions of the usenet hack and applied each set of SQL upgrade instructions separately, as if I had upgraded to 2.5 and then 2.6...

I checked the usenet_article table and found over 3000 articles in there...since this is a beta test for my users I decided to empty that table and all other tables apart from the tables that seem to have settings in them.

I re-ran the script on the one newsgroup (that I know only has a few hundred posts in anyway) and the whole thing went really quickly...i'm doing some more testing and starting to re-populate the forums now so will post more details on how I get on later...:D

Just an update...

I'm still reloading articles into my forums...(I run 12 newsgroups off vB and some of them are pretty active...)

It looks like i'm almost there and no problems seem to have been encountered yet. I can't tell how much time the Orphaned Check adds to my inserts but i'll do some tests.

Finally, how often are people running cron jobs to update their boards? I was setting mine to around one per hour...

I'm running mine once an hour too. With the new code that ensures all possible replies are inserted at each batch, it's no longer necessary to stay 'ahead' of the newsgroup. (As long as you update more often than your expire date is set to :) )

I've found some more bugs I'm afraid. The main one has to do with the regular expression parsing of the spam filters. I believe I've also fixed the problem Gilby brought up regarding forum generated posts. There was also a small issue that could cause posts to be displayed in the wrong order if a thread contained both usenet posts and forum posts.

The good news is they're all fixed and I've also added an auto purge function.

I'll release it as soon as I've done a bit more testing. In the meantime, if your regular expressions don't appear to work, don't worry, it's not a result of your regex skills. :)

This sounds really interesting, but I don't install hacks because I like to upgrade and we all know it's not fun with hacks installed.

My question is, is there any chance this will make it into 2.1? Also would you be willing to let John use the code so they can get this rolling?

I have heard talk about usenet being added before but I wasn't sure when it would take place.

Thanks
~Chris

Sorry, I didn't read the whole thread so please don't kill me if anyone has asked this before:
Is there a way to run the script on a server without having access to telnet and cron? Perhaps triggering the script every time the forum's title page is loaded or something like that. I only have FTP access to my webspace account :(
I already contacted the sysop of my server but he said that he won't install the perl script for me :(

Originally posted by LtData
Sorry, i didn't read the whole thread ...

like I said earlier, you should open new threads when a new version comes out (from 1.x to 2.x) and have a moderator close the old thread...

- LTdata

options:

1. Get a new host. If you are paying more than $10/month there is no way you should be without telnet access. I think you need to install some perl modules to get it working anyway (see first post) so without telnet access you would not be able to get it to work.

2. Try the telnet emulators out there, try www.tucows.com to see if one will do what you want.

Jelsoft inclusion in 2.1

Unfortunately I doubt they will do this, mainly because it is a flipping hard script to support - just look at this thread! But it would certainly be cool if they did. I can't wait until the final so I can upgrade and then install this hack first thing!

This release adds a couple of new features and fixes a few bugs.
New Features:

Added auto purge feature. (configurable via control panel)
Added Gilby's mod to manually review/insert orphan replies (Thanks Gilby)
Added additional option in spam & replace filters to allow patterns to match across multiple lines ('s' switch). There should be *nothing* you can't filter out of your messages now!
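
For anyone unsure what the 's' switch changes, here is a minimal sketch. It is written in Python rather than the gateway's Perl, using re.DOTALL, which is Python's counterpart to Perl's /s modifier. The [AD] markers, the sample message and the variable names are made up for illustration and are not taken from the hack.

```python
import re

# Made-up message with an advert block that spans several lines.
message = (
    "Hello all,\n"
    "[AD]\n"
    "Buy now!\n"
    "Limited offer\n"
    "[/AD]\n"
    "See you on the group.\n"
)

# Hypothetical replace filter: strip everything between [AD] and [/AD].
pattern = r"\[AD\].*?\[/AD\]\n?"

# Without the switch, '.' stops at each newline, so the block is never matched.
without_switch = re.sub(pattern, "", message)

# With re.DOTALL (the equivalent of Perl's /s), '.' also matches newlines,
# so the whole multi-line block is removed.
with_switch = re.sub(pattern, "", message, flags=re.DOTALL)

print(repr(without_switch))
print(repr(with_switch))
```

If a filter that looks right never seems to fire on text spread over several lines, this is usually the reason.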

Bug Fixes:

Fixed problem with escaping of spam and replace filters when treated as regular expressions
Fixed problem of forum posts being sent without msgid of parent post if parent was also a forum post.
Tidied up indexes (again)
Fixed display ordering problem when usenet and forum posts are in same thread
A few other little things that I forgot to write down.
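
The notes above don't say exactly what went wrong with filter escaping, so the following is only a general illustration of why escaping matters when a filter can be either plain text or a regular expression. It is a Python sketch, not the gateway's Perl code, and build_filter and the sample strings are hypothetical.

```python
import re


def build_filter(text, is_regex):
    """Compile a filter; plain-text filters are escaped so they match literally."""
    return re.compile(text if is_regex else re.escape(text))


subject = "Make $$$ fast (really!)"
spam_text = "$$$ fast (really!)"

# Treated as plain text: metacharacters are escaped, so the phrase is found.
literal_match = build_filter(spam_text, is_regex=False).search(subject)

# Treated as a regex without escaping: '$' means end-of-string and the
# parentheses form a group, so the same phrase no longer matches.
unescaped_match = build_filter(spam_text, is_regex=True).search(subject)

print(literal_match is not None)
print(unescaped_match is not None)
```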


Download version 2.7 for vB 2.0 beta 5 (http://britishexpats.com/download/usenet_gateway.tar.gz)

To upgrade from 2.6 you will need to run some SQL statements and make additional code changes. These are detailed in UPGRADE.txt.

You really should open a new thread....why post multiple versions in the same thread????

Originally posted by TechTalk
You really should open a new thread....why post multiple versions in the same thread????

Because then there would be 20 or 30 different threads floating around. At least this way everything is in one place. You can guarantee people would be posting in older threads if there were more than one. If people come across an old thread in a search or something, how do they know there's a new one? They then have to go searching backwards through all the hacks making sure they are reading the latest thread.

As mentioned in frequent posts, the latest version is always in the first post of this thread along with description, screenshots etc.

New readers to the thread will read the first post, see this statement then, if they want, they can read backwards from the end to see what's been happening recently.

Every time I have tried to view the examples or download the file, both http://britishexpats.com and http://www.dbforums.com are down :(

~Chris









